Errors during Detectnet_v2 inference on DeepStream

Please provide the following information when requesting support.

• Hardware (T4/V100/Xavier/Nano/etc) : T1000
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc) : Detectnet_v2
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here)
• Training spec file (if you have one, please share it here)
• How to reproduce the issue? (This is for errors. Please share the command line and the detailed log here.)

After training and configuring Detectnet_v2 in the DeepStream pipeline, the following error is returned:

To generate the .engine file I followed the documentation: first I exported the model to produce the .onnx, then during TensorRT engine generation I calibrated it to INT8. But now DeepStream reports that the TensorRT version used to create the file is different from the one in my local environment. All of the training, export, and engine generation using “tao model detectnet_v2 …” was carried out inside Docker, which from what I read ships an earlier version of TensorRT. I already tried to solve this using trtexec, but it could not be resolved.

There is an easier way. You do not need to generate the TensorRT engine yourself. You can just set the ONNX file in the spec file and let DeepStream generate the TensorRT engine. This makes sure the engine version is the same between building and inference.
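Concretely, that means pointing the nvinfer model config at the ONNX file (and, for INT8, the calibration cache) instead of a prebuilt .engine file. A minimal sketch — the paths, engine filename, and class count are placeholders for your own model:

```
[property]
# DeepStream builds the engine from the ONNX with its own TensorRT version
onnx-file=detectnet_v2.onnx
# Where DeepStream caches the generated engine after the first run (example name)
model-engine-file=detectnet_v2.onnx_b1_gpu0_int8.engine
# INT8 calibration cache produced during export/engine generation
int8-calib-file=calibration.bin
# 0=FP32, 1=INT8, 2=FP16
network-mode=1
num-detected-classes=3
```

On the first run DeepStream builds and serializes the engine; subsequent runs reuse the cached file if it matches the local TensorRT version.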

Good recommendation. I already tried letting the configuration file generate the TensorRT engine, but it reports that the calibration file (either .cache or .bin) obtained with the command “tao model detectnet_v2 calibration …” is not compatible with the TensorRT version I have installed. In this case, what would be the path forward? Can I create an INT8 calibration file outside the Docker environment?

Furthermore, I noticed that training was finishing correctly, but with an mAP (precision) of almost 0.0%. I have two suspicions:

  1. The .hdf5 file generated during training was not trained correctly, and this leads to incorrect calibration in the DeepStream installed on the host (outside the Docker environment);

  2. The training configuration .txt file is misconfigured and produces the low mAP.

I am attaching the .txt files resulting from the conversion of my COCO dataset to binary format, along with the training log and the training configuration file:

log_dataset_conver.txt (27.8 KB)
detectnet_v2_treinamento_fragmaq_spec.txt (9.3 KB)
log_train.txt (452.0 KB)

If you notice an error, let me know so I can correct it.

Please create a new topic for the training issue. Also, please mention the resolution of your training images. Are they all the same? If not, please set enable_auto_resize: true.

There is no such command. Please refer to tao_tutorials/notebooks/tao_launcher_starter_kit/detectnet_v2/detectnet_v2.ipynb at main · NVIDIA/tao_tutorials · GitHub to generate the calibration file.

  1. I still suggest using your server to generate the calibration cache file.
    For example, you can pull the TAO Deploy 5.0 docker (its TensorRT version is 8.5.3).
    5.0 deploy docker: nvcr.io/nvidia/tao/tao-toolkit:5.0.0-deploy /bin/bash
    Other versions of the deploy docker can be found in TAO Toolkit | NVIDIA NGC.

  2. To use a different TensorRT version,
    you can go to https://developer.nvidia.com/nvidia-tensorrt-8x-download and find the tar.gz file for the expected TensorRT version.

For example, for 8.6.1.6, run the following inside the TAO docker:

$ wget https://developer.nvidia.com/downloads/compute/machine-learning/tensorrt/secure/8.6.1/tars/TensorRT-8.6.1.6.Linux.x86_64-gnu.cuda-11.8.tar.gz
$ tar zxvf TensorRT-8.6.1.6.Linux.x86_64-gnu.cuda-11.8.tar.gz
$ pip install TensorRT-8.6.1.6/python/tensorrt-8.6.1-cp38-none-linux_x86_64.whl
$ export LD_LIBRARY_PATH=/home/morganh/TensorRT-8.6.1.6/lib:$LD_LIBRARY_PATH

  3. BTW, if you want to generate the calibration cache file on Jetson, you can follow tao_deploy/README.md at main · NVIDIA/tao_deploy · GitHub to install tao-deploy on Jetson.
    For example, JetPack 5.0 + nvcr.io/nvidia/l4t-tensorrt:r8.5.2.2-devel should work.
    You can download other versions of the l4t-tensorrt docker.
    But please do not flash JetPack 6.0 to the Jetson, due to this thread.
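The docker route in step 1 can be sketched as follows; the image tag is the one mentioned above, while the mount path and the trailing help command are placeholders for illustration (check the CLI's own --help for the exact engine/calibration options in your version):

```shell
# Pull the TAO 5.0 deploy image (ships TensorRT 8.5.3, per the post above)
docker pull nvcr.io/nvidia/tao/tao-toolkit:5.0.0-deploy

# Enter the container with your experiment directory mounted
# (/path/to/experiment is a placeholder for your own spec/data location)
docker run --runtime=nvidia -it --rm \
  -v /path/to/experiment:/workspace \
  nvcr.io/nvidia/tao/tao-toolkit:5.0.0-deploy /bin/bash

# Inside the container, inspect the available engine/calibration options:
# detectnet_v2 gen_trt_engine --help
```

Because the calibration cache and engine are then produced by the container's own TensorRT, they stay consistent with each other.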

Reading the documentation, I saw that the correct command is tao model detectnet_v2 gen_trt_engine. But when I generate the calibration file and try to use it in the environment where I have TensorRT 10.4 installed, it reports that the file is not compatible because it was generated with an earlier version, TensorRT 8.6 (installed together with the TAO Toolkit docker). Just to add: I am not working on a Jetson board, but on a notebook with a T1000 GPU; that is why I have version 10.4 installed.

You can refer to my above-mentioned steps (changing the version to 10.4; my steps are just a reference) to install your expected 10.4 version inside the TAO docker and generate the calibration file there.
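Adapting the earlier 8.6.1.6 steps mostly means swapping the version strings. A small sketch of the substitution — the 10.x build number and CUDA suffix below are placeholders, not verified; take the real tarball filename from the TensorRT download page for your CUDA version:

```shell
# Parameterize the download/extract/export steps by TensorRT version.
# TRT_VER below is a hypothetical 10.4 build number, shown only as an example.
TRT_VER="10.4.0.26"
TARBALL="TensorRT-${TRT_VER}.Linux.x86_64-gnu.cuda-12.6.tar.gz"

# The actual steps have network/disk side effects, so they are only printed here:
echo "wget <tarball-url-from-download-page>/${TARBALL}"
echo "tar zxvf ${TARBALL}"
echo "export LD_LIBRARY_PATH=\$PWD/TensorRT-${TRT_VER}/lib:\$LD_LIBRARY_PATH"
```

The same pattern applies to the Python wheel under `TensorRT-${TRT_VER}/python/`, matched to the container's Python version.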

There has been no update from you for a period, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.