Hello,
I’m sharing my new DeepStream repository, specifically designed for end-to-end YOLO models. This repository includes several advanced features that significantly enhance the scalability and performance of DeepStream on both dGPU and Jetson platforms. Here are some of the key benefits:
- Dynamic Shapes: the TensorRT engine can be built at a network resolution different from the one in the exported ONNX, providing flexibility in handling various input dimensions.
- Dynamic Batch Size: the batch size is adjusted dynamically to maximize model performance according to the GPU’s capacity, ensuring optimal utilization of resources (see the example build command after this list).
- NMS-Free Models: Our repository includes models that natively implement NMS-Free technology, available for some YOLOv9 models and all YOLOv10 detection models, streamlining the inference process.
- TensorRT Plugins: We have integrated TensorRT EfficientNMS plugins for detection models, and EfficientNMSX/ROIAlign plugins for segmentation models, enhancing the efficiency and accuracy of your applications.
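As a rough illustration of the dynamic shapes and dynamic batch size features (this uses plain trtexec rather than the repository’s own script; the input tensor name "images", the model file name, and the min/opt/max dimensions are only assumptions for a typical YOLO export):

trtexec --onnx=model.onnx --fp16 \
  --minShapes=images:1x3x640x640 \
  --optShapes=images:4x3x640x640 \
  --maxShapes=images:8x3x1280x1280 \
  --saveEngine=model.engine

With an optimization profile like this, the same engine can accept any batch size and resolution within the min/max bounds at runtime.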
These features offer significant potential for those looking to scale their use of DeepStream across various platforms. Whether you are deploying on dGPU or Jetson, our repository provides the tools necessary for efficient and high-performance implementations.
Thanks for sharing this with the community!
Thank you for sharing. I’m using the yolov9-seg model you shared on a Jetson Orin Nano with DeepStream 7.0, and it runs correctly, but I have a problem: inference is relatively slow, and the real-time results from the camera are delayed by about one second. Do you have any good solutions?
You need to quantize my model to INT8 or FP8 and use a smaller model on the Jetson, as YOLO models tend to be quite resource-intensive for this device. I haven’t performed quantization on segmentation models yet, but you could try using TensorRT Model Optimizer to improve performance.
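For example (just a sketch, not something I have validated on the segmentation models; it assumes you already have an INT8 calibration cache, e.g. produced with TensorRT Model Optimizer or a custom calibrator, and the file names are placeholders):

trtexec --onnx=models/yolov9c-seg-trt.onnx --int8 --fp16 \
  --calib=calib_yolov9c_seg.cache \
  --saveEngine=models/yolov9c-seg-int8.engine

Passing --fp16 together with --int8 lets TensorRT fall back to FP16 for layers that do not quantize well.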
I use this command to convert the model into an FP16 engine file: ‘bash scripts/onnx_to_trt.sh -p fp16 -f models/yolov9c-seg-trt.onnx -c config_pgie_yolo_det.txt’. Am I doing this right? After running it, the generated file appears to be FP32.
You are using the wrong configuration file (config_pgie_yolo_det.txt).
Change it to:
bash scripts/onnx_to_trt.sh -p fp16 -f models/yolov9c-seg-trt.onnx -c config_pgie_yolo_seg.txt
You can also change the precision at runtime through the network-mode property in the nvinfer config file: 0 is the default (FP32), 1 is INT8, and 2 is FP16.
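For example, in the [property] section of config_pgie_yolo_seg.txt (this is the standard nvinfer setting, so adjust it to your own file):

# 0=FP32, 1=INT8, 2=FP16
network-mode=2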
Sorry, I didn’t have my computer with me when I replied earlier, so I copied the command from your GitHub to show what I did, but I forgot to change the latter part, which caused the confusion.
This is a screenshot of my actual command, and I also set network-mode=2 in config_pgie_yolo_seg.txt.
But note that the output from generating the engine file contains the line: “[10/16/2024-14:59:15] [I] Precision: FP32+FP16”.
After the engine is generated successfully, I use the following command to check the actual precision (network-mode) of the generated engine file, and the output contains the line: “[10/16/2024-15:32:11] [I] Precision: FP32”.
@phylis7
Not all layers will be converted to FP16; some tensors, such as the network inputs and outputs, remain in FP32. This is how TensorRT operates.
You can use TREx to see detailed information about your engine; see the NVIDIA Technical Blog post “Exploring NVIDIA TensorRT Engines with TREx”.
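If you only need a quick check, you can also reload and benchmark the built engine with trtexec (the engine file name below is just a placeholder):

trtexec --loadEngine=model_b1_gpu0_fp16.engine

Keep in mind that, as far as I know, the “Precision:” line in trtexec’s startup summary reflects the flags passed to that particular invocation rather than the contents of the loaded engine, so the per-layer view from TREx is a more reliable way to see which layers actually run in FP16.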