I converted a custom yolov3 model (weights & config) to onnx, then converted this onnx model to a tensorrt engine per: https://docs.nvidia.com/deeplearning/sdk/tensorrt-sample-support-guide/index.html#yolov3_onnx
The onnx_to_tensorrt.py provided in the samples loads the saved tensorrt engine (FP32 precision) and performs inference on a sample image correctly.
However, when I try to load this model engine using deepstream by modifying the sample: /opt/nvidia/deepstream/deepstream-4.0/sources/objectDetector_Yolo/ I don’t see the expected results (very poor/almost no detection and incorrect bounding boxes). I did follow the instructions specified in: https://docs.nvidia.com/metropolis/deepstream/Custom_YOLO_Model_in_the_DeepStream_YOLO_App.pdf
and compiled the nvdsinfer_custom_impl_Yolo project. I also modified the deepstream_app_config_yoloV3.txt & config_infer_primary_yoloV3.txt to handle the custom model.
I’m not sure if I’m missing something, and I’d appreciate any ideas or suggestions to try.
Although they have the same output (obj_threshold is still different), I’m afraid PostprocessYOLO() in data_processing.py of the TRT yolov3_onnx sample is quite different from the post-processing in DeepStream; you could compare them. So if you use the TRT yolov3_onnx model with mismatched post-processing, it is expected to fail.
Thanks for your reply. I was told in one of the DeepStream GTC sessions that if a Yolov3 model can be processed by TensorRT APIs then it should work within DeepStream as well. So I assumed that TRT Engines have interfaces that allowed them to be used by DeepStream if they work within tensorrt. Anyway, I will take a look at the post-processing functions in TRT samples and the DeepStream example to see why it’s not working in DeepStream.
DeepStream wraps TensorRT for inference, so if a model can run with TensorRT, it can be used in DeepStream. I think this is his point. That applies only to inference itself, not to pre- and post-processing, which sit outside TensorRT and may differ even for the same kind of network.
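To make that concrete, here is a minimal Python sketch (not code from either sample) of how the same output values decode to different boxes when the post-processing disagrees about whether the sigmoid/exp activations were already applied inside the engine. The function names, the anchor, the grid size and the input values are illustrative assumptions.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode_raw(pred, cell, anchor, grid, net_size):
    """Decode one prediction whose values are still raw (no activations applied)."""
    tx, ty, tw, th, tobj = pred
    cx, cy = cell
    pw, ph = anchor
    stride = net_size / grid  # pixels per grid cell
    bx = (sigmoid(tx) + cx) * stride
    by = (sigmoid(ty) + cy) * stride
    bw = pw * math.exp(tw)
    bh = ph * math.exp(th)
    return bx, by, bw, bh, sigmoid(tobj)


def decode_activated(pred, cell, anchor, grid, net_size):
    """Decode one prediction assuming sigmoid/exp were already applied upstream."""
    x, y, w, h, obj = pred
    cx, cy = cell
    pw, ph = anchor
    stride = net_size / grid
    return (x + cx) * stride, (y + cy) * stride, pw * w, ph * h, obj


# Hypothetical raw values for a single cell of a 13x13 grid on a 416x416 input.
raw = (0.0, 0.0, 0.0, 0.0, 2.0)
args = dict(cell=(6, 6), anchor=(116, 90), grid=13, net_size=416)

matched = decode_raw(raw, **args)           # post-processing matches the output
mismatched = decode_activated(raw, **args)  # post-processing assumes the wrong format

print("matched:   ", matched)
print("mismatched:", mismatched)
```

With the matching decoder the box center lands at 208 px and the box keeps the anchor's width; the mismatched decoder moves the center to 192 px, collapses the width to zero and reports an "objectness" of 2.0, which is not a probability. Any detection threshold applied afterwards then behaves unpredictably.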
BTW, may I know why you want to use the YoloV3 from the TRT yolov3_onnx sample? Is it just for practice or …?
That’s good to know.
The resources from nvidia for loading yolov3 models onto jetson platforms pointed me to the yolov3_onnx sample, and later I found that /opt/nvidia/deepstream/deepstream-4.0/sources/objectDetector_Yolo/ also does something similar. However, the deepstream sample has no mechanism for performing INT8 calibration, so I worked with the tensorrt samples to create a custom INT8 model and calibration table. I tried loading the model and calibration table with the deepstream sample and encountered issues. Interestingly, I was able to generate the INT8 model from the deepstream sample and use the calibration table generated with the tensorrt sample.
The samples for yolov3 are helpful but based on the READMEs and available documentation I couldn’t piece together the points you have shared.
I also have a related issue with the deepstream YOLO sample using INT8 batch-size > 1 (Batch Size Failure in Custom YOLOv3 INT8). I was wondering if you had any thoughts on that as well.
Thank you so much for your reply !
Confirmed that the DS (DeepStream) YoloV3 and the TRT yolov3-onnx sample are based on exactly the same network and model (https://raw.githubusercontent.com/pjreddie/darknet/master/cfg/yolov3.cfg and https://pjreddie.com/media/files/yolov3.weights), but DS parses yolov3.cfg and makes some modifications.
You can use the two commands below to profile the TRT engines they generate. You will find that they have a different number of layers and that the last layers differ (DS YOLOV3 has a plugin to help with post-processing, while TRT YOLOV3 uses a Python tool to handle it).
- DeepStream yolov3:
$ tensorrt/bin/trtexec --loadEngine=model_b4_fp32.engine --dumpProfile --plugins=/opt/nvidia/deepstream/deepstream-4.0/sources/objectDetector_Yolo/nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
- TRT yolov3
$ ~/tensorrt/bin/trtexec --loadEngine=model_b1_fp32.engine --dumpProfile
I didn’t look into the detailed differences, but I believe the actual models running in DS and TRT YoloV3 are different. These differences, together with the post-processing, cause the issue you saw above.
Also, under /opt/nvidia/deepstream/deepstream-4.0/sources/objectDetector_Yolo/ you can find yolov3-calibration.table.trt5.1, which is the INT8 calibration table DS uses to run its YoloV3 in INT8 mode.
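As a quick sanity check before launching the app, the INT8-related settings in config_infer_primary_yoloV3.txt can be inspected with a short script. This is only a sketch: it assumes the nvinfer keys network-mode (with 1 selecting INT8), int8-calib-file and parse-bbox-func-name live in the [property] group, and the SAMPLE text is an illustrative fragment, not the full file shipped with the sample. The helper name int8_problems is hypothetical.

```python
import configparser

# Illustrative fragment only; a real config has more keys and groups.
SAMPLE = """\
[property]
# 0=FP32, 1=INT8, 2=FP16
network-mode=1
int8-calib-file=yolov3-calibration.table.trt5.1
custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
parse-bbox-func-name=NvDsInferParseCustomYoloV3
"""


def int8_problems(config_text):
    """Return a list of human-readable problems with the INT8 settings."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    prop = parser["property"] if parser.has_section("property") else {}

    problems = []
    if prop.get("network-mode") != "1":
        problems.append("network-mode is not 1 (INT8)")
    if not prop.get("int8-calib-file"):
        problems.append("int8-calib-file is not set")
    if not prop.get("parse-bbox-func-name"):
        problems.append("parse-bbox-func-name is not set (no custom parser)")
    return problems


print(int8_problems(SAMPLE))
print(int8_problems("[property]\nnetwork-mode=0\n"))
```

To check a real file, read it first, for example `int8_problems(open("config_infer_primary_yoloV3.txt").read())`. An empty list only means the keys are present; it does not prove that the calibration table matches the engine.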
Regarding the topic (Batch Size Failure in Custom YOLOv3 INT8), my colleague can’t reproduce the issue you mentioned. Could you provide more details on how to reproduce it?
This is great. Thank you so much for your detailed explanation. It’s good to know there are some differences between the TRT YOLOv3 python sample and the deepstream sample. I will take a look at the trtexec tool per your suggestion to understand the differences further.
Given these differences and the limitations of the two samples, what’s the recommended way to load a custom YOLOv3 model in deepstream in INT8 precision, and how do I generate a calibration table that can be used with deepstream? At this point I’m able to load my custom model in deepstream with the yolo sample, but I’m not clear on how to generate a calibration table that’s compatible with the deepstream sample.
Thanks again !
DeepStream wraps TensorRT for inference. The INT8 calibration that generates the INT8 calibration table is actually done by TensorRT, so you could refer to the TensorRT doc - https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#optimizing_int8_c and the TensorRT sample sampleINT8 for how to generate an INT8 calibration table that can be used in DeepStream. Note that the TensorRT version you use for the INT8 calibration needs to be the same version that DeepStream uses.
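One rough way to check the version requirement is to compare the header lines of two calibration tables, for example the one shipped with the DeepStream sample and the one you generated. The sketch below assumes the table is a text file whose first line looks like `TRT-<digits>-EntropyCalibration2`; that layout is an assumption about the cache format, not something guaranteed by the documentation, so inspect your own files first. The helper names are hypothetical.

```python
import re


def read_header(path):
    """Return the first line of a calibration table file."""
    with open(path, "r") as f:
        return f.readline()


def trt_tag(header_line):
    """Extract the numeric tag from a header such as 'TRT-5105-EntropyCalibration2'."""
    match = re.match(r"TRT-(\d+)-", header_line.strip())
    return match.group(1) if match else None


def same_trt_tag(header_a, header_b):
    """True only if both headers carry a tag and the tags are identical."""
    tag_a, tag_b = trt_tag(header_a), trt_tag(header_b)
    return tag_a is not None and tag_a == tag_b


# Hypothetical header lines for illustration.
print(same_trt_tag("TRT-5105-EntropyCalibration2\n", "TRT-5105-EntropyCalibration2\n"))
print(same_trt_tag("TRT-5105-EntropyCalibration2\n", "TRT-7000-EntropyCalibration2\n"))
```

With real files you would call `same_trt_tag(read_header(table_a), read_header(table_b))`. A matching tag is a necessary check at best: the table must also have been generated for the same network, with the same layer names, as the engine DeepStream builds.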
Besides the model and the INT8 calibration table, you also need to take care of the pre- and post-processing code for the custom model.
BTW, with the DeepStream 5.0 and TLT2.0 release, we created a project for running TLT2.0 models with DeepStream5.0 - https://github.com/NVIDIA-AI-IOT/deepstream_tlt_apps . As you can see in this project, TLT2.0 YoloV3 has its own post-processing code under the nvdsinfer_customparser_yolov3_tlt folder.
That’s very helpful to know. Will keep in mind the need to keep the TRT model and INT8 calibration in sync for DeepStream. I will also check out the TLT 2.0 release. Thank you very much !