I have trained a yolo_v4 model on TAO Toolkit 4.0.1 with quantization (QAT) set to true, exported the model to INT8, and tried deploying it in my Python application, which runs on DeepStream 6.0. I am getting the following error on the console:
parseModel: Failed to parse ONNX model
ERROR: tlt/tlt_decode.cpp:389 Failed to build network, error in model parsing.
The following is the command I used for exporting the model:
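Roughly, it had the shape of a standard TAO 4.0 yolo_v4 export. The paths, spec file, and key below are placeholders rather than my actual values, and any INT8/QAT-specific flags should be checked against the TAO docs rather than taken from this sketch:

# placeholder paths and key; -m is the trained .tlt, -e the training spec, -o the exported .etlt
tao yolo_v4 export \
  -m /workspace/experiments/yolov4/weights/yolov4_resnet18_epoch_080.tlt \
  -k $KEY \
  -e /workspace/specs/yolo_v4_train_resnet18.txt \
  -o /workspace/export/d17_jul24_yolov4_resnet18_epoch_080_v4.1.etlt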
The infer-dims value I had given in the nvinfer config was wrong; I have updated it. This is the new nvinfer config: d15_yolov3_resnet18_epoch_070.conf (4.4 KB). The error
still persists; this is the debug log for the updated config: 20230804.log (122.2 KB)
@adithya.ajith
Could you double-check the .etlt model?
You exported it, and its name is d17_jul24_yolov4_resnet18_epoch_080_v4.1.etlt.
Could you set that in the DeepStream config as well?
Currently, it is
That is the file name of the model; in the logs
you can see DeepStream loading the model layers. I had exported the model twice, thinking there might have been an issue with the export command not running properly. The export command I shared is from the first run, and the model name was different the second time I ran the export. Sorry for the confusion caused.
Please check the input width and height again.
In your training spec, it is
output_width: 960
output_height: 544
But in the DS config, it does not match:
infer-dims=3;576;1024
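A matching setting (nvinfer's infer-dims is ordered channel;height;width) would be along these lines:

# must match output_height: 544 and output_width: 960 from the training spec
infer-dims=3;544;960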
More, to check whether the .etlt model and its key are correct, I suggest you run the experiment below to narrow this down.
After exporting, please check whether tao-deploy yolo_v4 gen_trt_engine and tao-deploy yolo_v4 inference work. Refer to GPU-optimized AI, Machine Learning, & HPC Software | NVIDIA NGC.
--cal_json_file: The path to the json file containing tensor scale for QAT models. This argument is required if engine for QAT model is being generated.
Note
When exporting a model trained with QAT enabled, the tensor scale factors to calibrate the activations are peeled out of the model and serialized to a JSON file defined by the cal_json_file argument.
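For reference, the two checks would look roughly like the sketch below. The paths, spec file, and key are placeholders, and the exact flag names should be verified against the tao-deploy yolo_v4 documentation on NGC:

# build an INT8 engine from the QAT .etlt using the serialized tensor scales
tao-deploy yolo_v4 gen_trt_engine \
  -m /workspace/export/d17_jul24_yolov4_resnet18_epoch_080_v4.1.etlt \
  -k $KEY \
  -e /workspace/specs/yolo_v4_train_resnet18.txt \
  --data_type int8 \
  --cal_json_file /workspace/export/cal.json \
  --engine_file /workspace/export/yolov4_int8.engine

# run inference with the generated engine on a few test images
tao-deploy yolo_v4 inference \
  -m /workspace/export/yolov4_int8.engine \
  -e /workspace/specs/yolo_v4_train_resnet18.txt \
  -i /workspace/test_images \
  -r /workspace/inference_results

If both steps work, the .etlt model and key can be ruled out and the problem is on the DeepStream side.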
Now the command runs without any error and the INT8 TRT engine is created. I ran tao yolo_v4 inference with the engine file; it runs, and I am getting inference results on the frames. So I guess the issue is not with the .etlt file, and the key is correct.