YOLOv3 TensorRT model in DeepStream

dilip.s · April 23, 2020, 10:37pm

I converted a custom yolov3 model (weights & config) to onnx, then converted this onnx model to a tensorrt engine per: Sample Support Guide :: NVIDIA Deep Learning TensorRT Documentation
The onnx_to_tensorrt.py provided in the samples loads the saved tensorrt engine (FP32 precision) and performs inference on a sample image correctly.

However, when I try to load this model engine using deepstream by modifying the sample /opt/nvidia/deepstream/deepstream-4.0/sources/objectDetector_Yolo/, I don’t see the expected results (very poor/almost no detection and incorrect bounding boxes). I did follow the instructions specified in https://docs.nvidia.com/metropolis/deepstream/Custom_YOLO_Model_in_the_DeepStream_YOLO_App.pdf and compiled the nvdsinfer_custom_impl_Yolo project. I also modified deepstream_app_config_yoloV3.txt & config_infer_primary_yoloV3.txt to handle the custom model.

I’m not sure if I’m missing something, and I would appreciate any ideas or suggestions to try.

Thanks,
Dilip.

mchi · April 28, 2020, 2:21am

Hi Dilip.s,
Although they have the same output (obj_threshold is still different), I’m afraid the PostprocessYOLO() in data_processing.py of the TRT yolov3_onnx sample is quite different from the post-processing in DeepStream; you could compare them. So if you use the TRT yolov3_onnx model with mismatched post-processing, it is expected to fail.
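
To make the mismatch concrete, here is a minimal sketch of the standard YOLOv3 box decoding for a single grid cell, in plain Python. It is not the code from data_processing.py or from the DeepStream parser; the helper name decode_cell, its arguments, and the sample numbers are made up for illustration. The point is only that the decoder has to agree with the engine on whether the sigmoid/exp activations were already applied to the output. If the two disagree, the boxes come out wrong even though inference itself ran fine.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode_cell(raw, col, row, grid_w, grid_h, anchor_w, anchor_h,
                input_w, input_h, already_activated=False):
    """Decode one YOLOv3 prediction (tx, ty, tw, th, objectness).

    Hypothetical helper, for illustration only.
    raw: five numbers produced by the network for one anchor in one cell.
    already_activated: True if sigmoid/exp were applied before the values
    reached this function (e.g. inside the engine), False if they are raw.
    Returns (center_x, center_y, width, height, objectness), with the box
    expressed in input-image pixels.
    """
    tx, ty, tw, th, tobj = raw
    if not already_activated:
        tx, ty, tobj = sigmoid(tx), sigmoid(ty), sigmoid(tobj)
        tw, th = math.exp(tw), math.exp(th)
    cx = (col + tx) / grid_w * input_w
    cy = (row + ty) / grid_h * input_h
    w = tw * anchor_w
    h = th * anchor_h
    return cx, cy, w, h, tobj


# Raw (not yet activated) values for one anchor in cell (6, 6) of a 13x13 grid.
raw = (0.0, 0.0, 0.0, 0.0, 2.0)
right = decode_cell(raw, 6, 6, 13, 13, 116, 90, 416, 416)
# Same values, but the decoder wrongly assumes activations were already applied.
wrong = decode_cell(raw, 6, 6, 13, 13, 116, 90, 416, 416, already_activated=True)
print(right)
print(wrong)
```

With the wrong assumption the box collapses to zero width and the objectness falls outside the 0..1 range, which is the kind of symptom (few detections, bad boxes) that mismatched post-processing produces.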

Thanks!

dilip.s · April 28, 2020, 10:46pm

Hello mchi,

Thanks for your reply. I was told in one of the DeepStream GTC sessions that if a Yolov3 model can be processed by TensorRT APIs then it should work within DeepStream as well. So I assumed that TRT Engines have interfaces that allowed them to be used by DeepStream if they work within tensorrt. Anyway, I will take a look at the post-processing functions in TRT samples and the DeepStream example to see why it’s not working in DeepStream.

-Dilip.

mchi · April 29, 2020, 1:01am

DeepStream wraps TensorRT for inference, so if a model can run with TensorRT, it can be used in DeepStream. I think this was his point. But that only covers inference, not pre- and post-processing, since pre- and post-processing happen outside TensorRT and may differ even for the same kind of network.
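
As a small illustration of what "outside TensorRT" means on the pre-processing side: the Gst-nvinfer documentation describes input normalization as y = net-scale-factor * (x - mean), configured through the net-scale-factor and offsets settings. The sketch below, in plain Python with hypothetical helper names, checks whether a given setting reproduces what a standalone script does to each pixel. Dividing by 255 is used here only as an assumed example of such a script’s scaling, so compare against your own pre-processing code.

```python
def nvinfer_normalize(pixel, net_scale_factor, offset=0.0):
    """Per-pixel normalization in the form y = net_scale_factor * (x - offset).

    Hypothetical helper that mirrors the formula documented for nvinfer's
    net-scale-factor and offsets settings; it is not DeepStream code.
    """
    return net_scale_factor * (pixel - offset)


def script_normalize(pixel):
    """Assumed example of a standalone script's scaling: map 0..255 to 0..1."""
    return pixel / 255.0


pixels = [0, 64, 128, 255]
matching = [nvinfer_normalize(p, 1.0 / 255.0) for p in pixels]
mismatched = [nvinfer_normalize(p, 1.0) for p in pixels]  # scale left at 1
expected = [script_normalize(p) for p in pixels]

# Largest per-pixel difference between each setting and the script's result.
max_err_matching = max(abs(a - b) for a, b in zip(matching, expected))
max_err_mismatched = max(abs(a - b) for a, b in zip(mismatched, expected))
print(max_err_matching, max_err_mismatched)
```

The same engine sees very different input values under the two settings, so a model that works in a standalone script can still misbehave in a pipeline whose normalization does not match.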

BTW, may I know why you want to use the YoloV3 of TRT yolov3_onnx sample? Is it just a practice or …?

Thanks!

dilip.s · April 29, 2020, 6:33pm

Hello @mchi,

That’s good to know.

The resources from nvidia for loading yolov3 models on to jetson platforms pointed me to the yolov3_onnx sample and later I found the /opt/nvidia/deepstream/deepstream-4.0/sources/objectDetector_Yolo/ also does something similar. However, the deepstream sample has no mechanism for performing INT8 calibration so I worked with the tensorrt samples to create a custom INT8 model and calibration table. I tried loading the model and calibration with the deepstream sample and encountered issues. Interestingly, I was able to generate the INT8 model from the deepstream sample and use the tensorrt sample based calibration table.

The samples for yolov3 are helpful but based on the READMEs and available documentation I couldn’t piece together the points you have shared.

I also have a related issue with the deepstream YOLO sample using INT8 batch-size > 1 (Batch Size Failure in Custom YOLOv3 INT8). I was wondering if you had any thoughts on that as well.

Thank you so much for your reply !

mchi · April 30, 2020, 4:57am

Hi Dilip.s
Confirmed that the DS (DeepStream) YoloV3 and TRT yolov3-onnx samples are based on exactly the same network and model (https://raw.githubusercontent.com/pjreddie/darknet/master/cfg/yolov3.cfg and https://pjreddie.com/media/files/yolov3.weights), but DS parses the yolov3.cfg and makes some modifications.

You can use the two commands below to profile the TRT engines generated by them. You will find that they have a different number of layers and that the last layers are different (DS YOLOV3 has a plugin to help with post-processing, while TRT YOLOV3 uses a Python tool to handle it).

DeepStream yolov3:
$ tensorrt/bin/trtexec --loadEngine=model_b4_fp32.engine --dumpProfile --plugins=/opt/nvidia/deepstream/deepstream-4.0/sources/objectDetector_Yolo/nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
TRT yolov3:
$ ~/tensorrt/bin/trtexec --loadEngine=model_b1_fp32.engine --dumpProfile

I didn’t look into the detailed differences, but I believe the actual models running in DS and TRT YoloV3 are different. These differences and the post-processing cause the issue you saw above.

And, as you can find under /opt/nvidia/deepstream/deepstream-4.0/sources/objectDetector_Yolo/, there is yolov3-calibration.table.trt5.1, which is the INT8 calibration table for DS to run its Yolov3 under INT8 mode.

For the topic (Batch Size Failure in Custom YOLOv3 INT8), my colleague can’t reproduce the issue you mentioned. Could you provide more details on how to reproduce it?

Thanks!

dilip.s · April 30, 2020, 11:20pm

Hello mchi,

This is great. Thank you so much for your detailed explanation. It’s good to know there are some differences between the TRT YOLOv3 python sample and the deepstream sample. I will take a look at the trtexec tool per your suggestion to understand the differences further.

Given these differences and the limitations of the two samples, what’s the recommended way to load a custom YOLOv3 model in deepstream in INT8 precision, and how do I generate a calibration table that can be used with deepstream? At this point I’m able to load my custom model in deepstream with the yolo sample, but I’m not clear how to generate a calibration table that’s compatible with the deepstream sample.

Thanks again !

mchi · May 3, 2020, 2:12am

Hi Dilip.s,
DeepStream wraps TensorRT for inference. The INT8 calibration that generates the INT8 calibration table is actually done by TensorRT, so you could refer to the TensorRT doc - Developer Guide :: NVIDIA Deep Learning TensorRT Documentation and the TensorRT sample - sampleINT8 for how to generate an INT8 calibration table which can be used in DeepStream. Note that the TensorRT version you use to do the INT8 calibration needs to be the same version that DeepStream uses.
Besides the model and the INT8 calibration table, you also need to take care of the pre- and post-processing code for the custom model.
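
One quick sanity check for the version requirement: calibration caches written by TensorRT’s calibrators typically begin with a header line such as TRT-5105-EntropyCalibration2, followed by one "tensor name: value" line per tensor. Assuming that layout (please verify against your own file), a minimal Python sketch can pull out the version token and the tensor names so they can be compared with the TensorRT version and the network that DeepStream uses. The function name and the sample text below are made up for illustration.

```python
def read_calibration_header(text):
    """Split a calibration cache into (version_token, calibrator, tensor_names).

    Assumes the first line looks like 'TRT-<version>-<calibrator>' and every
    following non-empty line looks like '<tensor name>: <hex value>'.
    Hypothetical helper; check the layout against your own cache file.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith("TRT-"):
        raise ValueError("unexpected calibration cache header")
    _, version, calibrator = lines[0].split("-", 2)
    # Keep everything before the last colon as the tensor name.
    tensor_names = [ln.rsplit(":", 1)[0].strip() for ln in lines[1:]]
    return version, calibrator, tensor_names


# Made-up sample content, not taken from a real table.
sample = """TRT-5105-EntropyCalibration2
data: 3c010a14
conv_1: 3d2f1b22
"""

version, calibrator, names = read_calibration_header(sample)
print(version, calibrator, names)
```

Comparing the version token with the TensorRT version that DeepStream links against is a cheap first check before debugging anything else.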

BTW, with the DeepStream 5.0 and TLT 2.0 releases, we created a project for running TLT 2.0 models with DeepStream 5.0 - GitHub - NVIDIA-AI-IOT/deepstream_tao_apps: Sample apps to demonstrate how to deploy models trained with TAO on DeepStream. As you can see in this project, TLT 2.0 YoloV3 has its own post-processing code under the nvdsinfer_customparser_yolov3_tlt folder.

dilip.s · May 4, 2020, 5:07pm

Hello mchi,

That’s very helpful to know. Will keep in mind the need to keep the TRT model and INT8 calibration in sync for DeepStream. I will also check out the TLT 2.0 release. Thank you very much !

-Dilip.
