Hello,
I’m sharing my new DeepStream repository, specifically designed for end-to-end YOLO models. This repository includes several advanced features that significantly enhance the scalability and performance of DeepStream on both dGPU and Jetson platforms. Here are some of the key benefits:
- Dynamic Shapes: the TensorRT engine can be built at a network resolution different from the one in the exported ONNX, providing flexibility in handling various input dimensions.
- Dynamic Batch Size: the batch size is adjusted dynamically to maximize model performance according to the GPU’s capacity, ensuring optimal utilization of resources (see the example build command after this list).
- NMS-Free Models: Our repository includes models that natively implement NMS-Free technology, available for some YOLOv9 models and all YOLOv10 detection models, streamlining the inference process.
- TensorRT Plugins: We have integrated TensorRT EfficientNMS plugins for detection models, and EfficientNMSX/ROIAlign plugins for segmentation models, enhancing the efficiency and accuracy of your applications.
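As a rough illustration of the dynamic shapes and dynamic batch size features (this uses plain trtexec rather than the repository’s own script; the input tensor name "images", the model file name, and the min/opt/max dimensions are only assumptions for a typical YOLO export):

trtexec --onnx=model.onnx --fp16 \
  --minShapes=images:1x3x640x640 \
  --optShapes=images:4x3x640x640 \
  --maxShapes=images:8x3x1280x1280 \
  --saveEngine=model.engine

With an optimization profile like this, the same engine can accept any batch size and resolution within the min/max bounds at runtime.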
These features offer significant potential for those looking to scale their use of DeepStream across various platforms. Whether you are deploying on dGPU or Jetson, our repository provides the tools necessary for efficient and high-performance implementations.
Thanks for sharing this with the community!
Thank you for sharing. I’m using the yolov9-seg model you shared on a Jetson Orin Nano with DeepStream 7.0, and it runs correctly, but I have a problem: inference is relatively slow, and the real-time results from the camera are delayed by about one second. Do you have any good solutions?
You need to quantize my model to INT8 or FP8 and use a smaller model on the Jetson, as YOLO models tend to be quite resource-intensive for this device. I haven’t performed quantization on segmentation models yet, but you could try using TensorRT Model Optimizer to improve performance.
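For example (just a sketch, not something I have validated on the segmentation models; it assumes you already have an INT8 calibration cache, e.g. produced with TensorRT Model Optimizer or a custom calibrator, and the file names are placeholders):

trtexec --onnx=models/yolov9c-seg-trt.onnx --int8 --fp16 \
  --calib=calib_yolov9c_seg.cache \
  --saveEngine=models/yolov9c-seg-int8.engine

Passing --fp16 together with --int8 lets TensorRT fall back to FP16 for layers that do not quantize well.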
I use this command to convert the model into an FP16 engine file: ‘bash scripts/onnx_to_trt.sh -p fp16 -f models/yolov9c-seg-trt.onnx -c config_pgie_yolo_det.txt’. Am I doing this right? After running it, the generated file appears to be FP32.
You are using the wrong configuration file (config_pgie_yolo_det.txt).
Change it to:
bash scripts/onnx_to_trt.sh -p fp16 -f models/yolov9c-seg-trt.onnx -c config_pgie_yolo_seg.txt
You can also change the precision at runtime through the network-mode property in the nvinfer config file: 0 is the default (FP32), 1 is INT8, and 2 is FP16.
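For example, in the [property] section of config_pgie_yolo_seg.txt (this is the standard nvinfer setting, so adjust it to your own file):

# 0=FP32, 1=INT8, 2=FP16
network-mode=2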
Sorry, I didn’t have my computer with me when I replied earlier, so I copied the command from your GitHub to show what I did, but I forgot to change the latter part, which caused the confusion.
This is a screenshot of my actual command, and I also set network-mode=2 in config_pgie_yolo_seg.txt.
But note that the output from generating the engine file contains the line: “[10/16/2024-14:59:15] [I] Precision: FP32+FP16”.
After the engine is generated successfully, I use the following command to check the actual precision (network-mode) of the generated engine file, and the output contains the line: “[10/16/2024-15:32:11] [I] Precision: FP32”.
@phylis7
Not all layers will be converted to FP16; some tensors, such as the network inputs and outputs, remain in FP32. This is how TensorRT operates.
You can use TREx to see detailed information about your engine; see the NVIDIA Technical Blog post “Exploring NVIDIA TensorRT Engines with TREx”.
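If you only need a quick check, you can also reload and benchmark the built engine with trtexec (the engine file name below is just a placeholder):

trtexec --loadEngine=model_b1_gpu0_fp16.engine

Keep in mind that, as far as I know, the “Precision:” line in trtexec’s startup summary reflects the flags passed to that particular invocation rather than the contents of the loaded engine, so the per-layer view from TREx is a more reliable way to see which layers actually run in FP16.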