Error integrating YOLOv4 in DeepStream 6, 6.1, 6.1.1, and 6.2

• Hardware : A10 GPU
• Network Type : Yolov4

CUDA version used: 11.6
TensorRT: 8.2.5.1
cuDNN: cudnn-local-repo-ubuntu2004-8.6.0.163_1.0-1_amd64.deb

Following are the details of all the options we have tried for integrating the YOLOv4 model into the above-mentioned DeepStream versions; however, each attempt fails with one error or another.

  1. Trained YOLOv4 on a custom dataset using TAO 3.21.08.

  2. Exported the model using tao-converter in two ways:

    a. The model converted with the tao-converter from TAO Toolkit 3.21.08 failed to deploy.
    b. The model converted with tao-converter-x86-tensorrt8.2 also failed.

  3. Building libnvinfer_plugin.so failed outside the container, as CMake gives the following error:

 root@nvmbdprp023229:~/TensorRT/build# /usr/local/bin/cmake .. -DGPU_ARCHS=86  -DTRT_LIB_DIR=/usr/lib/x86_64-linux-gnu/ -DCMAKE_C_COMPILER=/usr/bin/gcc -DTRT_BIN_DIR=`pwd`/out
Building for TensorRT version: 8.5.3, library version: 8
-- The CXX compiler identification is unknown
-- The CUDA compiler identification is unknown
-- Check for working CXX compiler: /usr/local/bin/g++
-- Check for working CXX compiler: /usr/local/bin/g++ -- broken
CMake Error at /usr/local/share/cmake-3.13/Modules/CMakeTestCXXCompiler.cmake:45 (message):
  The C++ compiler "/usr/local/bin/g++" is not able to compile a simple test
  program. It fails with the following output:
    Change Dir: /root/TensorRT/build/CMakeFiles/CMakeTmp
    Run Build Command:"/usr/local/bin/make" "cmTC_230cd/fast"
    /usr/local/bin/make -f CMakeFiles/cmTC_230cd.dir/build.make CMakeFiles/cmTC_230cd.dir/build
    make[1]: Entering directory '/root/TensorRT/build/CMakeFiles/CMakeTmp'
    Building CXX object CMakeFiles/cmTC_230cd.dir/testCXXCompiler.cxx.o
    /usr/local/bin/g++     -o CMakeFiles/cmTC_230cd.dir/testCXXCompiler.cxx.o -c /root/TensorRT/build/CMakeFiles/CMakeTmp/testCXXCompiler.cxx
    g++: fatal error: cannot execute ‘cc1plus’: execvp: No such file or directory
    compilation terminated.
    make[1]: *** [CMakeFiles/cmTC_230cd.dir/build.make:66: CMakeFiles/cmTC_230cd.dir/testCXXCompiler.cxx.o] Error 1
    make[1]: Leaving directory '/root/TensorRT/build/CMakeFiles/CMakeTmp'
    make: *** [Makefile:121: cmTC_230cd/fast] Error 2  
  CMake will not be able to correctly generate this project.
Call Stack (most recent call first):
  CMakeLists.txt:48 (project) 
-- Configuring incomplete, errors occurred!
See also "/root/TensorRT/build/CMakeFiles/CMakeOutput.log".
See also "/root/TensorRT/build/CMakeFiles/CMakeError.log".
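The `cannot execute 'cc1plus': execvp` failure usually means the g++ found first on PATH (here a manually installed /usr/local/bin/g++) is missing its internal compiler components, while the distro compiler in /usr/bin may be intact. A minimal sketch to confirm which compiler actually works (the `check_cxx` helper is our own, not part of CMake or GCC):

```shell
# check_cxx: print "ok" if the given C++ compiler can build a trivial program,
# "broken" otherwise (this also covers the missing-cc1plus case).
check_cxx() {
  tmp=$(mktemp -d)
  echo 'int main(){return 0;}' > "$tmp/t.cpp"
  if "$1" "$tmp/t.cpp" -o "$tmp/t" 2>/dev/null; then echo ok; else echo broken; fi
  rm -rf "$tmp"
}
check_cxx /usr/local/bin/g++   # on the machine above this would print "broken"
check_cxx /usr/bin/g++         # if this prints "ok", point CMake at it:
# cmake .. -DCMAKE_C_COMPILER=/usr/bin/gcc -DCMAKE_CXX_COMPILER=/usr/bin/g++ ...
```

If the distro compiler is broken too, reinstalling it (`sudo apt-get install --reinstall g++`) normally restores cc1plus.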



  4. Inside the DeepStream 6.1.1 container we tried generating an engine file for YOLOv4. libnvinfer_plugin.so was generated successfully, but while creating the FP32 engine file we get the following error:
[INFO] [MemUsageChange] Init CUDA: CPU +326, GPU +0, now: CPU 338, GPU 399 (MiB)
[INFO] [MemUsageChange] Init builder kernel library: CPU +442, GPU +118, now: CPU 834, GPU 517 (MiB)
[WARNING] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
[INFO] ----------------------------------------------------------------
[INFO] Input filename: /tmp/file13lbBF
[INFO] ONNX IR version: 0.0.0
[INFO] Opset version: 0
[INFO] Producer name:
[INFO] Producer version:
[INFO] Domain:
[INFO] Model version: 0
[INFO] Doc string:
[INFO] ----------------------------------------------------------------
[ERROR] Number of optimization profiles does not match model input node number.
Aborted (core dumped)

libnvinfer_plugin.so was again generated successfully, but while creating the INT8 engine file we get the following error:

[INFO] [MemUsageChange] Init CUDA: CPU +473, GPU +0, now: CPU 484, GPU 467 (MiB)
[INFO] [MemUsageSnapshot] Begin constructing builder kernel library: CPU 484 MiB, GPU 467 MiB
[INFO] [MemUsageSnapshot] End constructing builder kernel library: CPU 638 MiB, GPU 509 MiB
[libprotobuf ERROR google/protobuf/text_format.cc:298] Error parsing text-format onnx2trt_onnx.ModelProto: 1:1: Interpreting non ascii codepoint 143.
[libprotobuf ERROR google/protobuf/text_format.cc:298] Error parsing text-format onnx2trt_onnx.ModelProto: 1:1: Expected identifier, got: 
[ERROR] ModelImporter.cpp:735: Failed to parse ONNX model from file: /tmp/filepuyWXu
[ERROR] Failed to parse the model, please check the encoding key to make sure it's correct
[ERROR] Number of optimization profiles does not match model input node number.
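On this second failure, the protobuf `Interpreting non ascii codepoint` message is what the parser prints when the decrypted .etlt bytes are garbage, which typically points at a wrong or empty encoding key rather than at the model itself. A small sketch to fail fast before invoking tao-converter (the `require_key` helper is hypothetical, not a TAO tool):

```shell
# require_key: refuse to proceed when KEY is unset or empty; tao-converter
# decrypts with whatever key it is given and only fails later in the ONNX parser.
require_key() {
  if [ -z "${KEY:-}" ]; then
    echo "KEY is not set"
    return 1
  fi
  echo "key length: ${#KEY}"
}
KEY="nvidia_tlt"   # example value; use the key the model was trained/exported with
require_key
```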

  5. We also used the TensorRT container (nv-tensorrt-local-repo-ubuntu2004-8.5.2-cuda-11.8_1.0-1_amd64.deb) and got the same error as in point 4.
  6. We tried using TAO Deploy, which gives an error while installing the tao-deploy package:
ERROR: Failed building wheel for mpi4py
  Building wheel for seaborn (setup.py) ... done
  Created wheel for seaborn: filename=seaborn-0.7.1-py3-none-any.whl size=165926 sha256=ebe0c86ed1cb6a16fb94512dc99ba731b97e206383054481db4765ac1bbabe57
  Stored in directory: /tmp/pip-ephem-wheel-cache-m0og5kpf/wheels/97/14/28/123fdaafb903da8a74ad19826a2e800903d5ad8c5fd3be68e1
  Building wheel for antlr4-python3-runtime (setup.py) ... done
  Created wheel for antlr4-python3-runtime: filename=antlr4_python3_runtime-4.9.3-py3-none-any.whl size=144573 sha256=32e1ef6f0c7b7e810dfb4ae1b9ead740cdd57986345c051bfcecca4e812f3404
  Stored in directory: /tmp/pip-ephem-wheel-cache-m0og5kpf/wheels/b1/a3/c2/6df046c09459b73cc9bb6c4401b0be6c47048baf9a1617c485
Successfully built pycocotools-fix seaborn antlr4-python3-runtime
Failed to build mpi4py
ERROR: Could not build wheels for mpi4py which use PEP 517 and cannot be installed directly 
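On the mpi4py failure: mpi4py has no prebuilt wheel for this environment and compiles against an MPI installation at pip-install time, so the PEP 517 wheel build dies when the `mpicc` compiler wrapper is absent. A sketch of the usual pre-check (assumes an Ubuntu/apt system, matching the containers above; the `mpi_hint` helper is our own):

```shell
# mpi_hint: mpi4py's setup.py invokes mpicc; if it is missing, installing the
# OpenMPI development packages first usually fixes the wheel build error.
mpi_hint() {
  if command -v mpicc >/dev/null 2>&1; then
    echo "mpicc present, mpi4py should build"
  else
    echo "run: sudo apt-get install -y libopenmpi-dev openmpi-bin, then retry pip"
  fi
}
mpi_hint
```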

Can you share your command?

./tao-converter -k $KEY  \
    -p Input,1x3x544x960,8x3x544x960,16x3x544x960 \
    -d 3,544,960 \
    -o BatchedNMS \
    -e ./trt.engine \
    -t fp32 \
    ./yolov4_resnet18_fp32_epoch_080.etlt
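For context, the `-p` argument encodes `<input-node-name>,<min-shape>,<opt-shape>,<max-shape>`, and the "Number of optimization profiles does not match model input node number" error appears to fire when that name does not match the model's actual input node (or when `-p` is missing entirely for a dynamic-shape model). A tiny sketch that checks the string is at least well-formed before the real run (the `validate_profile` helper is our own, not part of tao-converter):

```shell
# validate_profile: check a tao-converter -p value has 4 comma-separated fields
# and that the three shapes contain only digits and 'x' (e.g. 8x3x544x960).
validate_profile() {
  IFS=, read -r name min opt max <<EOF
$1
EOF
  for shape in "$min" "$opt" "$max"; do
    case "$shape" in
      ""|*[!0-9x]*) echo "bad shape: '$shape' in '$1'"; return 1 ;;
    esac
  done
  echo "profile for input '$name' looks well-formed"
}
validate_profile "Input,1x3x544x960,8x3x544x960,16x3x544x960"
```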

Please set the correct $KEY. You can set it explicitly.

We did set the key before using the variable in this command. Still the same error.

Where did you download tao-converter?

We used the following link to download tao-converter

https://developer.nvidia.com/tao-converter-82

Please try TAO Converter | NVIDIA NGC:
wget --content-disposition 'https://api.ngc.nvidia.com/v2/resources/nvidia/tao/tao-converter/versions/v3.22.05_trt8.2_x86/files/tao-converter'

We used that link to download the tao-converter; following is the error:

root@79a02620db56:/opt/nvidia/deepstream/deepstream-6.1/sources/objectDetector_Yolo/nvdsinfer_custom_impl_Yolo/TensorRT/build# ./tao-converter -k $KEY \
>  -p Input,1x3x544x960,8x3x544x960,16x3x544x960 \
>  -d 3,544,960 \
>  -o BatchedNMS \
>  -e ./trt.engine \
>  -t fp16 \
>  ./yolov4_resnet18_epoch_80.etlt
[INFO] [MemUsageChange] Init CUDA: CPU +473, GPU +0, now: CPU 484, GPU 2550 (MiB)
[INFO] [MemUsageSnapshot] Begin constructing builder kernel library: CPU 484 MiB, GPU 2550 MiB
[INFO] [MemUsageSnapshot] End constructing builder kernel library: CPU 638 MiB, GPU 2592 MiB
[libprotobuf ERROR google/protobuf/text_format.cc:298] Error parsing text-format onnx2trt_onnx.ModelProto: 1:1: Interpreting non ascii codepoint 143.
[libprotobuf ERROR google/protobuf/text_format.cc:298] Error parsing text-format onnx2trt_onnx.ModelProto: 1:1: Expected identifier, got: ▒
[ERROR] ModelImporter.cpp:735: Failed to parse ONNX model from file: /tmp/filem70tHL
[ERROR] Failed to parse the model, please check the encoding key to make sure it's correct
[ERROR] Number of optimization profiles does not match model input node number.
Aborted (core dumped)

Your command has some unexpected characters. Please double check.
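The "unexpected characters" here are typically curly quotes or non-breaking spaces picked up when copy-pasting the command from a web page; the shell then passes them through as literal argument bytes. A quick sketch for spotting them, after saving the command to a file (the `has_non_ascii` helper is our own):

```shell
# has_non_ascii: print any line of a file containing bytes outside printable
# ASCII (curly quotes, non-breaking spaces, etc.); exits 0 if any were found.
has_non_ascii() {
  LC_ALL=C grep -n '[^ -~]' "$1"
}
# Demo: a command pasted with typographic quotes around the key (octal escapes
# encode the UTF-8 curly-quote bytes)
printf './tao-converter -k \342\200\234KEY\342\200\235 ...\n' > /tmp/cmd.txt
has_non_ascii /tmp/cmd.txt && echo "clean these lines before running"
```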

Command used

 ./tao-converter -e trt.engine -p Input,1x3x544x960,8x3x544x960,16x3x544x960 -t fp16 -k ZGNpYXZ0NHE1czFmbDBlcGR0Z2RzOHJqcWw6NGZjMjUwMDMtN2QyNC00MzYzLTlhZDctOTA1MDM3YTUwYTMy -m 1 yolov4_resnet18_epoch_fp16.etlt

Output with error

[INFO] [MemUsageChange] Init CUDA: CPU +473, GPU +0, now: CPU 484, GPU 2550 (MiB)
[INFO] [MemUsageSnapshot] Begin constructing builder kernel library: CPU 484 MiB, GPU 2550 MiB
[INFO] [MemUsageSnapshot] End constructing builder kernel library: CPU 638 MiB, GPU 2592 MiB
[INFO] ----------------------------------------------------------------
[INFO] Input filename:   /tmp/file0LCOiC
[INFO] ONNX IR version:  0.0.8
[INFO] Opset version:    15
[INFO] Producer name:
[INFO] Producer version:
[INFO] Domain:
[INFO] Model version:    0
[INFO] Doc string:
[INFO] ----------------------------------------------------------------
[WARNING] onnx2trt_utils.cpp:366: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[WARNING] onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
[WARNING] onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
[ERROR] ModelImporter.cpp:773: While parsing node number 78 [Reshape -> "bg_reshape/Reshape:0"]:
[ERROR] ModelImporter.cpp:774: --- Begin node ---
[ERROR] ModelImporter.cpp:775: input: "bg_permute/transpose:0"
input: "shape_tensor3"
output: "bg_reshape/Reshape:0"
name: "bg_reshape"
op_type: "Reshape"
[ERROR] ModelImporter.cpp:776: --- End node ---
[ERROR] ModelImporter.cpp:779: ERROR: ModelImporter.cpp:162 In function parseGraph:
[6] Invalid Node - bg_reshape
Attribute not found: allowzero
Invalid Node - bg_reshape
Attribute not found: allowzero
[ERROR] Failed to parse the model, please check the encoding key to make sure it's correct
[INFO] Detected input dimensions from the model: (-1, 3, 544, 960)
[INFO] Model has dynamic shape. Setting up optimization profiles.
[INFO] Using optimization profile min shape: (1, 3, 544, 960) for input: Input
[INFO] Using optimization profile opt shape: (8, 3, 544, 960) for input: Input
[INFO] Using optimization profile max shape: (16, 3, 544, 960) for input: Input
[ERROR] 4: [network.cpp::validate::2633] Error Code 4: Internal Error (Network must have at least one output)
[ERROR] Unable to create engine
Segmentation fault (core dumped)

Please export a new etlt model by adding --target_opset.

--target_opset 12

We tried the suggestion, but we get the following error.

Following is the command used to generate a new etlt model:

!tao yolo_v4 export -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights/yolov4_resnet18_epoch_$EPOCH.tlt \
                    -k $KEY \
                    -o $USER_EXPERIMENT_DIR/export/yolov4_resnet18_epoch_$EPOCH.etlt \
                    -e $SPECS_DIR/yolo_v4_retrain_resnet18_kitti.txt \
                    --batch_size 8 \
                    --data_type fp16 \
                    --gen_ds_config \
                    --target_opset 12

Following is the error

2023-03-20 12:56:15,221 [INFO] root: Registry: ['nvcr.io']
2023-03-20 12:56:15,655 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit-tf:v3.21.11-tf1.15.5-py3
Using TensorFlow backend.
usage: yolo_v4 export [-h] [--num_processes NUM_PROCESSES] [--gpus GPUS]
                      [--gpu_index GPU_INDEX [GPU_INDEX ...]] [--use_amp]
                      [--log_file LOG_FILE] -m MODEL -k KEY [-o OUTPUT_FILE]
                      [--force_ptq] [--cal_data_file CAL_DATA_FILE]
                      [--cal_image_dir CAL_IMAGE_DIR]
                      [--data_type {fp32,fp16,int8}] [-s] [--gen_ds_config]
                      [--cal_cache_file CAL_CACHE_FILE] [--batches BATCHES]
                      [--max_workspace_size MAX_WORKSPACE_SIZE]
                      [--max_batch_size MAX_BATCH_SIZE]
                      [--batch_size BATCH_SIZE]
                      [--min_batch_size MIN_BATCH_SIZE]
                      [--opt_batch_size OPT_BATCH_SIZE] [-e EXPERIMENT_SPEC]
                      [--engine_file ENGINE_FILE]
                      [--static_batch_size STATIC_BATCH_SIZE]
                      [--results_dir RESULTS_DIR] [-v]
                      {dataset_convert,evaluate,export,inference,kmeans,prune,train}
                      ...
yolo_v4 export: error: invalid choice: '12' (choose from 'dataset_convert', 'evaluate', 'export', 'inference', 'kmeans', 'prune', 'train')
2023-03-20 12:56:29,880 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.
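One plausible reading of the `invalid choice: '12'` failure is that the line continuations in the notebook cell got corrupted: a space after a trailing backslash ends the command early, so `12` is no longer attached to `--target_opset` and falls through to the subcommand parser. A sketch to scan a saved command for that pattern (the `bad_continuations` helper is our own):

```shell
# bad_continuations: flag lines that end with a backslash followed by trailing
# whitespace, which silently breaks shell line continuation.
bad_continuations() {
  grep -n '\\[[:space:]][[:space:]]*$' "$1"
}
# Demo: note the stray space after the first line's backslash
printf 'tao yolo_v4 export -m model.tlt \\ \n  --target_opset 12\n' > /tmp/nb_cmd.txt
bad_continuations /tmp/nb_cmd.txt && echo "fix these continuations"
```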

There is no update from you for a period, assuming this is not an issue anymore. Hence we are closing this topic. If need further support, please open a new one. Thanks

Can you double check?
Or can you open a terminal instead, as below, and run the following commands?
$ tao yolo_v4 run /bin/bash
then inside the docker,

# yolo_v4 export -m xxx.tlt -k nvidia_tlt -o xxx.etlt -e spec.txt --target_opset 12

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.