• Hardware: A10 GPU
• Network type: YOLOv4
• CUDA version: 11.6
• TensorRT: 8.2.5.1
• cuDNN: cudnn-local-repo-ubuntu2004-8.6.0.163_1.0-1_amd64.deb
Following are the details of all the options we have tried for integrating the YOLOv4 model in the above-mentioned DeepStream versions; however, each attempt has failed with one error or another.
- Trained YOLOv4 on a custom dataset using TAO 3.21.08.
- Exported the model using tao-converter in two ways:
a. The model converted with the tao-converter from TAO Toolkit 3.21.08 failed to deploy.
b. The model converted with tao-converter-x86-tensorrt8.2 also failed.
- Building libnvinfer_plugin.so outside the container failed, as cmake gives the following error:
root@nvmbdprp023229:~/TensorRT/build# /usr/local/bin/cmake .. -DGPU_ARCHS=86 -DTRT_LIB_DIR=/usr/lib/x86_64-linux-gnu/ -DCMAKE_C_COMPILER=/usr/bin/gcc -DTRT_BIN_DIR=`pwd`/out
Building for TensorRT version: 8.5.3, library version: 8
-- The CXX compiler identification is unknown
-- The CUDA compiler identification is unknown
-- Check for working CXX compiler: /usr/local/bin/g++
-- Check for working CXX compiler: /usr/local/bin/g++ -- broken
CMake Error at /usr/local/share/cmake-3.13/Modules/CMakeTestCXXCompiler.cmake:45 (message):
The C++ compiler "/usr/local/bin/g++" is not able to compile a simple test program. It fails with the following output: Change Dir: /root/TensorRT/build/CMakeFiles/CMakeTmp Run Build Command:"/usr/local/bin/make" "cmTC_230cd/fast"
/usr/local/bin/make -f CMakeFiles/cmTC_230cd.dir/build.make CMakeFiles/cmTC_230cd.dir/build
make[1]: Entering directory '/root/TensorRT/build/CMakeFiles/CMakeTmp'
Building CXX object CMakeFiles/cmTC_230cd.dir/testCXXCompiler.cxx.o
/usr/local/bin/g++ -o CMakeFiles/cmTC_230cd.dir/testCXXCompiler.cxx.o -c /root/TensorRT/build/CMakeFiles/CMakeTmp/testCXXCompiler.cxx
g++: fatal error: cannot execute ‘cc1plus’: execvp: No such file or directory
compilation terminated.
make[1]: *** [CMakeFiles/cmTC_230cd.dir/build.make:66: CMakeFiles/cmTC_230cd.dir/testCXXCompiler.cxx.o] Error 1
make[1]: Leaving directory '/root/TensorRT/build/CMakeFiles/CMakeTmp'
make: *** [Makefile:121: cmTC_230cd/fast] Error 2
CMake will not be able to correctly generate this project.
Call Stack (most recent call first):
CMakeLists.txt:48 (project)
-- Configuring incomplete, errors occurred!
See also "/root/TensorRT/build/CMakeFiles/CMakeOutput.log".
See also "/root/TensorRT/build/CMakeFiles/CMakeError.log".
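The `cannot execute 'cc1plus'` failure usually means the `g++` driver that CMake picked up (`/usr/local/bin/g++`) has no matching C++ compiler backend installed; `cc1plus` ships with the distro `g++` package. A minimal sketch of how this can be diagnosed and worked around, assuming the distro compilers in `/usr/bin` are intact:

```shell
# If the backend is missing, this prints just "cc1plus" instead of a full path.
/usr/local/bin/g++ -print-prog-name=cc1plus

# Install the distro C++ compiler if it is absent (Ubuntu 20.04).
apt-get update && apt-get install -y g++

# Point CMake at known-good compilers explicitly, instead of letting it
# pick the broken /usr/local/bin/g++ that is first on PATH. Note that the
# original command only pinned the C compiler, not the C++ one.
/usr/local/bin/cmake .. -DGPU_ARCHS=86 \
  -DTRT_LIB_DIR=/usr/lib/x86_64-linux-gnu/ \
  -DCMAKE_C_COMPILER=/usr/bin/gcc \
  -DCMAKE_CXX_COMPILER=/usr/bin/g++ \
  -DTRT_BIN_DIR=`pwd`/out
```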
- Inside the DeepStream 6.1.1 container we tried generating the engine file for YOLOv4. libnvinfer_plugin.so was generated successfully, but while creating the FP32 engine file we get the following error:
[INFO] [MemUsageChange] Init CUDA: CPU +326, GPU +0, now: CPU 338, GPU 399 (MiB)
[INFO] [MemUsageChange] Init builder kernel library: CPU +442, GPU +118, now: CPU 834, GPU 517 (MiB)
[WARNING] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
[INFO] ----------------------------------------------------------------
[INFO] Input filename: /tmp/file13lbBF
[INFO] ONNX IR version: 0.0.0
[INFO] Opset version: 0
[INFO] Producer name:
[INFO] Producer version:
[INFO] Domain:
[INFO] Model version: 0
[INFO] Doc string:
[INFO] ----------------------------------------------------------------
[ERROR] Number of optimization profiles does not match model input node number.
Aborted (core dumped)
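The "Number of optimization profiles does not match model input node number" error typically appears when a model exported with a dynamic batch dimension is converted without an optimization profile supplied for each input. A sketch of a tao-converter invocation with the `-p` profile option; the input name, shapes, and batch range below are placeholders for the values actually used during the YOLOv4 export, not the real ones:

```shell
# One -p per model input: <name>,<min shape>,<opt shape>,<max shape>.
# "Input" and 3x384x1248 are assumptions; use the dimensions from your export.
./tao-converter yolov4.etlt \
  -k $ENCODING_KEY \
  -p Input,1x3x384x1248,8x3x384x1248,16x3x384x1248 \
  -t fp32 \
  -e yolov4_fp32.engine
```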
libnvinfer_plugin.so was again generated successfully, but while creating the INT8 engine file we get the following error:
[INFO] [MemUsageChange] Init CUDA: CPU +473, GPU +0, now: CPU 484, GPU 467 (MiB)
[INFO] [MemUsageSnapshot] Begin constructing builder kernel library: CPU 484 MiB, GPU 467 MiB
[INFO] [MemUsageSnapshot] End constructing builder kernel library: CPU 638 MiB, GPU 509 MiB
[libprotobuf ERROR google/protobuf/text_format.cc:298] Error parsing text-format onnx2trt_onnx.ModelProto: 1:1: Interpreting non ascii codepoint 143.
[libprotobuf ERROR google/protobuf/text_format.cc:298] Error parsing text-format onnx2trt_onnx.ModelProto: 1:1: Expected identifier, got:
[ERROR] ModelImporter.cpp:735: Failed to parse ONNX model from file: /tmp/filepuyWXu
[ERROR] Failed to parse the model, please check the encoding key to make sure it's correct
[ERROR] Number of optimization profiles does not match model input node number.
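The libprotobuf "Interpreting non ascii codepoint" failure means the parser received undecrypted bytes, which matches the hint in the last error line: the key passed at conversion time must be exactly the one used for `tao yolo_v4 export`. A sketch, assuming `$KEY` is that training/export key and that an INT8 calibration cache was produced during export (the file names here are placeholders):

```shell
# The key is case-sensitive and must match the export key exactly;
# quote it so trailing whitespace or shell expansion cannot corrupt it.
./tao-converter yolov4.etlt \
  -k "$KEY" \
  -t int8 \
  -c calibration.bin \
  -e yolov4_int8.engine
```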
- We also tried the TensorRT container (nv-tensorrt-local-repo-ubuntu2004-8.5.2-cuda-11.8_1.0-1_amd64.deb) and got the same errors as in the previous step.
- We then tried TAO Deploy, which fails while installing the tao-deploy package:
ERROR: Failed building wheel for mpi4py
Building wheel for seaborn (setup.py) ... done
Created wheel for seaborn: filename=seaborn-0.7.1-py3-none-any.whl size=165926 sha256=ebe0c86ed1cb6a16fb94512dc99ba731b97e206383054481db4765ac1bbabe57
Stored in directory: /tmp/pip-ephem-wheel-cache-m0og5kpf/wheels/97/14/28/123fdaafb903da8a74ad19826a2e800903d5ad8c5fd3be68e1
Building wheel for antlr4-python3-runtime (setup.py) ... done
Created wheel for antlr4-python3-runtime: filename=antlr4_python3_runtime-4.9.3-py3-none-any.whl size=144573 sha256=32e1ef6f0c7b7e810dfb4ae1b9ead740cdd57986345c051bfcecca4e812f3404
Stored in directory: /tmp/pip-ephem-wheel-cache-m0og5kpf/wheels/b1/a3/c2/6df046c09459b73cc9bb6c4401b0be6c47048baf9a1617c485
Successfully built pycocotools-fix seaborn antlr4-python3-runtime
Failed to build mpi4py
ERROR: Could not build wheels for mpi4py which use PEP 517 and cannot be installed directly
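The mpi4py wheel build fails when no MPI implementation and headers are present on the system, since pip compiles it against `mpicc` at install time. A sketch of the usual workaround on Ubuntu 20.04; the apt package names are the stock Ubuntu ones and the pip package name `nvidia-tao-deploy` is our assumption about how TAO Deploy is installed here:

```shell
# Provide an MPI toolchain so the mpi4py wheel can compile.
apt-get update && apt-get install -y libopenmpi-dev openmpi-bin

# Retry the TAO Deploy install once mpicc is on PATH.
pip install nvidia-tao-deploy
```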