• Hardware: A10 GPU
• Network type: YOLOv4
• CUDA version: 11.6
• TensorRT: 8.2.5.1
• cuDNN: cudnn-local-repo-ubuntu2004-8.6.0.163_1.0-1_amd64.deb
Following are the details of all the options we have tried for integrating the YOLOv4 model in the above-mentioned DeepStream versions; however, each attempt has failed with one error or another.
- Trained YOLOv4 on a custom dataset using TAO 3.21.08.
- Exported the model using tao-converter in two ways:
a. The model converted with the tao-converter from TAO Toolkit 3.21.08 failed to deploy.
b. The model converted with tao-converter-x86-tensorrt8.2 also failed.
- Building libnvinfer_plugin.so outside the container failed, as cmake gives the following error:
root@nvmbdprp023229:~/TensorRT/build# /usr/local/bin/cmake .. -DGPU_ARCHS=86 -DTRT_LIB_DIR=/usr/lib/x86_64-linux-gnu/ -DCMAKE_C_COMPILER=/usr/bin/gcc -DTRT_BIN_DIR=`pwd`/out
Building for TensorRT version: 8.5.3, library version: 8
-- The CXX compiler identification is unknown
-- The CUDA compiler identification is unknown
-- Check for working CXX compiler: /usr/local/bin/g++
-- Check for working CXX compiler: /usr/local/bin/g++ -- broken
CMake Error at /usr/local/share/cmake-3.13/Modules/CMakeTestCXXCompiler.cmake:45 (message):
The C++ compiler "/usr/local/bin/g++" is not able to compile a simple test program. It fails with the following output: Change Dir: /root/TensorRT/build/CMakeFiles/CMakeTmp Run Build Command:"/usr/local/bin/make" "cmTC_230cd/fast"
/usr/local/bin/make -f CMakeFiles/cmTC_230cd.dir/build.make CMakeFiles/cmTC_230cd.dir/build
make[1]: Entering directory '/root/TensorRT/build/CMakeFiles/CMakeTmp'
Building CXX object CMakeFiles/cmTC_230cd.dir/testCXXCompiler.cxx.o
/usr/local/bin/g++ -o CMakeFiles/cmTC_230cd.dir/testCXXCompiler.cxx.o -c /root/TensorRT/build/CMakeFiles/CMakeTmp/testCXXCompiler.cxx
g++: fatal error: cannot execute ‘cc1plus’: execvp: No such file or directory
compilation terminated.
make[1]: *** [CMakeFiles/cmTC_230cd.dir/build.make:66: CMakeFiles/cmTC_230cd.dir/testCXXCompiler.cxx.o] Error 1
make[1]: Leaving directory '/root/TensorRT/build/CMakeFiles/CMakeTmp'
make: *** [Makefile:121: cmTC_230cd/fast] Error 2
CMake will not be able to correctly generate this project.
Call Stack (most recent call first):
CMakeLists.txt:48 (project)
-- Configuring incomplete, errors occurred!
See also "/root/TensorRT/build/CMakeFiles/CMakeOutput.log".
See also "/root/TensorRT/build/CMakeFiles/CMakeError.log".
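The `cannot execute 'cc1plus'` failure usually means the `g++` driver that CMake picked up (`/usr/local/bin/g++`) has no matching C++ compiler backend installed; `cc1plus` ships with the distro `g++` package. A minimal sketch of how this can be diagnosed and worked around, assuming the distro compilers in `/usr/bin` are intact:

```shell
# If the backend is missing, this prints just "cc1plus" instead of a full path.
/usr/local/bin/g++ -print-prog-name=cc1plus

# Install the distro C++ compiler if it is absent (Ubuntu 20.04).
apt-get update && apt-get install -y g++

# Point CMake at known-good compilers explicitly, instead of letting it
# pick the broken /usr/local/bin/g++ that is first on PATH. Note that the
# original command only pinned the C compiler, not the C++ one.
/usr/local/bin/cmake .. -DGPU_ARCHS=86 \
  -DTRT_LIB_DIR=/usr/lib/x86_64-linux-gnu/ \
  -DCMAKE_C_COMPILER=/usr/bin/gcc \
  -DCMAKE_CXX_COMPILER=/usr/bin/g++ \
  -DTRT_BIN_DIR=`pwd`/out
```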
- Inside the DeepStream 6.1.1 container we tried generating the engine file for YOLOv4. libnvinfer_plugin.so was generated successfully, but while creating the FP32 engine file we get the following error:
[INFO] [MemUsageChange] Init CUDA: CPU +326, GPU +0, now: CPU 338, GPU 399 (MiB)
[INFO] [MemUsageChange] Init builder kernel library: CPU +442, GPU +118, now: CPU 834, GPU 517 (MiB)
[WARNING] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
[INFO] ----------------------------------------------------------------
[INFO] Input filename: /tmp/file13lbBF
[INFO] ONNX IR version: 0.0.0
[INFO] Opset version: 0
[INFO] Producer name:
[INFO] Producer version:
[INFO] Domain:
[INFO] Model version: 0
[INFO] Doc string:
[INFO] ----------------------------------------------------------------
[ERROR] Number of optimization profiles does not match model input node number.
Aborted (core dumped)
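The "Number of optimization profiles does not match model input node number" error typically appears when a model exported with a dynamic batch dimension is converted without an optimization profile supplied for each input. A sketch of a tao-converter invocation with the `-p` profile option; the input name, shapes, and batch range below are placeholders for the values actually used during the YOLOv4 export, not the real ones:

```shell
# One -p per model input: <name>,<min shape>,<opt shape>,<max shape>.
# "Input" and 3x384x1248 are assumptions; use the dimensions from your export.
./tao-converter yolov4.etlt \
  -k $ENCODING_KEY \
  -p Input,1x3x384x1248,8x3x384x1248,16x3x384x1248 \
  -t fp32 \
  -e yolov4_fp32.engine
```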
libnvinfer_plugin.so was again generated successfully, but while creating the INT8 engine file we get the following error:
[INFO] [MemUsageChange] Init CUDA: CPU +473, GPU +0, now: CPU 484, GPU 467 (MiB)
[INFO] [MemUsageSnapshot] Begin constructing builder kernel library: CPU 484 MiB, GPU 467 MiB
[INFO] [MemUsageSnapshot] End constructing builder kernel library: CPU 638 MiB, GPU 509 MiB
[libprotobuf ERROR google/protobuf/text_format.cc:298] Error parsing text-format onnx2trt_onnx.ModelProto: 1:1: Interpreting non ascii codepoint 143.
[libprotobuf ERROR google/protobuf/text_format.cc:298] Error parsing text-format onnx2trt_onnx.ModelProto: 1:1: Expected identifier, got:
[ERROR] ModelImporter.cpp:735: Failed to parse ONNX model from file: /tmp/filepuyWXu
[ERROR] Failed to parse the model, please check the encoding key to make sure it's correct
[ERROR] Number of optimization profiles does not match model input node number.
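The libprotobuf "Interpreting non ascii codepoint" failure means the parser received undecrypted bytes, which matches the hint in the last error line: the key passed at conversion time must be exactly the one used for `tao yolo_v4 export`. A sketch, assuming `$KEY` is that training/export key and that an INT8 calibration cache was produced during export (the file names here are placeholders):

```shell
# The key is case-sensitive and must match the export key exactly;
# quote it so trailing whitespace or shell expansion cannot corrupt it.
./tao-converter yolov4.etlt \
  -k "$KEY" \
  -t int8 \
  -c calibration.bin \
  -e yolov4_int8.engine
```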
- We also tried the TensorRT container (nv-tensorrt-local-repo-ubuntu2004-8.5.2-cuda-11.8_1.0-1_amd64.deb) and got the same errors as in the previous step.
- We then tried TAO Deploy, which fails while installing the tao-deploy package:
ERROR: Failed building wheel for mpi4py
Building wheel for seaborn (setup.py) ... done
Created wheel for seaborn: filename=seaborn-0.7.1-py3-none-any.whl size=165926 sha256=ebe0c86ed1cb6a16fb94512dc99ba731b97e206383054481db4765ac1bbabe57
Stored in directory: /tmp/pip-ephem-wheel-cache-m0og5kpf/wheels/97/14/28/123fdaafb903da8a74ad19826a2e800903d5ad8c5fd3be68e1
Building wheel for antlr4-python3-runtime (setup.py) ... done
Created wheel for antlr4-python3-runtime: filename=antlr4_python3_runtime-4.9.3-py3-none-any.whl size=144573 sha256=32e1ef6f0c7b7e810dfb4ae1b9ead740cdd57986345c051bfcecca4e812f3404
Stored in directory: /tmp/pip-ephem-wheel-cache-m0og5kpf/wheels/b1/a3/c2/6df046c09459b73cc9bb6c4401b0be6c47048baf9a1617c485
Successfully built pycocotools-fix seaborn antlr4-python3-runtime
Failed to build mpi4py
ERROR: Could not build wheels for mpi4py which use PEP 517 and cannot be installed directly
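The mpi4py wheel build fails when no MPI implementation and headers are present on the system, since pip compiles it against `mpicc` at install time. A sketch of the usual workaround on Ubuntu 20.04; the apt package names are the stock Ubuntu ones and the pip package name `nvidia-tao-deploy` is our assumption about how TAO Deploy is installed here:

```shell
# Provide an MPI toolchain so the mpi4py wheel can compile.
apt-get update && apt-get install -y libopenmpi-dev openmpi-bin

# Retry the TAO Deploy install once mpicc is on PATH.
pip install nvidia-tao-deploy
```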