Improved DeepStream for YOLO models

marcoslucianops · February 14, 2022, 1:26am

Hi, I would like to share my work on improving the DeepStream support for YOLO models provided by NVIDIA.

Link: GitHub - marcoslucianops/DeepStream-Yolo: NVIDIA DeepStream SDK 6.1 / 6.0.1 / 6.0 configuration for YOLO models

Improvements on this repository

Darknet CFG params parser (no need to edit nvdsparsebbox_Yolo.cpp or another file)
Support for new_coords, beta_nms and scale_x_y params
Support for new models
Support for new layers
Support for new activations
Support for convolutional groups
Support for INT8 calibration
Support for non-square models
Support for implicit and channel layers (YOLOR)
YOLOv5 6.0 native support
Initial YOLOR native support
Model benchmarks
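
To illustrate the first two items: a Darknet .cfg file is an INI-like sequence of [section] blocks with key=value lines, and params such as new_coords, beta_nms and scale_x_y live in the [yolo] blocks. The following is a minimal Python sketch of reading them. It is not the repository's parser, and the names parse_darknet_cfg and yolo_params, the sample cfg text, and the fallback defaults are all assumptions made for illustration.

```python
def parse_darknet_cfg(text):
    """Parse Darknet .cfg text into a list of (section_name, params) tuples.

    Sections can repeat (e.g. several [yolo] blocks), so a list is used
    instead of a dict keyed by section name.
    """
    sections = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            sections.append((line[1:-1].strip(), {}))
        elif "=" in line and sections:
            key, value = line.split("=", 1)
            sections[-1][1][key.strip()] = value.strip()
    return sections


def yolo_params(sections):
    """Collect decoding-related params from every [yolo] block.

    The fallback values are assumptions of this sketch, used when a key
    is missing from the block.
    """
    result = []
    for name, params in sections:
        if name != "yolo":
            continue
        result.append({
            "new_coords": int(params.get("new_coords", 0)),
            "scale_x_y": float(params.get("scale_x_y", 1.0)),
            "beta_nms": float(params.get("beta_nms", 0.6)),
            "mask": [int(m) for m in params.get("mask", "").split(",") if m.strip()],
        })
    return result


# Hypothetical cfg fragment, only for demonstration.
SAMPLE_CFG = """
[net]
width=640
height=384

[yolo]
mask=0,1,2
scale_x_y=2.0
new_coords=1
beta_nms=0.6
"""

if __name__ == "__main__":
    print(yolo_params(parse_darknet_cfg(SAMPLE_CFG)))
```

Reading these values from the cfg at load time is what removes the need to hard-code them in a bounding-box parser source file.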

Future updates

New documentation for multiple models
DeepStream tutorials
Native PP-YOLO support
GPU NMS
Dynamic batch-size

Tested models

Benchmarks

Board: NVIDIA GTX 1050 4GB (Mobile)

YOLOR-CSP performance comparison

	DeepStream	PyTorch
FPS (without display)	13.32	10.07
FPS (with display)	12.63	9.41

YOLOv5n performance comparison

	DeepStream	TensorRTx	Ultralytics
FPS (without display)	110.25	87.42	97.19
FPS (with display)	105.62	73.07	50.37
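
The relative speedups implied by the two comparison tables above can be computed directly from the posted FPS figures. The snippet below is only arithmetic on those numbers; the dictionary layout and the function name speedups are illustrative choices, not anything from the repository.

```python
# FPS values copied from the two comparison tables above.
FPS = {
    "YOLOR-CSP": {
        "without display": {"DeepStream": 13.32, "PyTorch": 10.07},
        "with display": {"DeepStream": 12.63, "PyTorch": 9.41},
    },
    "YOLOv5n": {
        "without display": {"DeepStream": 110.25, "TensorRTx": 87.42, "Ultralytics": 97.19},
        "with display": {"DeepStream": 105.62, "TensorRTx": 73.07, "Ultralytics": 50.37},
    },
}


def speedups(row, baseline="DeepStream"):
    """Return baseline FPS divided by each other framework's FPS."""
    return {
        name: row[baseline] / value
        for name, value in row.items()
        if name != baseline
    }


if __name__ == "__main__":
    for model, modes in FPS.items():
        for mode, row in modes.items():
            for name, ratio in speedups(row).items():
                print(f"{model} ({mode}): DeepStream is {ratio:.2f}x {name}")
```

On these figures DeepStream is roughly 1.3x faster than PyTorch for YOLOR-CSP, and the gap to Ultralytics for YOLOv5n widens noticeably once display is enabled.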

More

Model (DeepStream)	Precision	Resolution	mAP (IoU=0.5:0.95)	mAP (IoU=0.5)	mAP (IoU=0.75)	FPS (without display)
YOLOR-P6	FP32	1280	0.478	0.663	0.519	5.53
YOLOR-CSP-X*	FP32	640	0.473	0.664	0.513	7.59
YOLOR-CSP-X	FP32	640	0.470	0.661	0.507	7.52
YOLOR-CSP*	FP32	640	0.459	0.652	0.496	13.28
YOLOR-CSP	FP32	640	0.449	0.639	0.483	13.32
YOLOv5x6 6.0	FP32	1280	0.504	0.681	0.547	2.22
YOLOv5l6 6.0	FP32	1280	0.492	0.670	0.535	4.05
YOLOv5m6 6.0	FP32	1280	0.463	0.642	0.504	7.54
YOLOv5s6 6.0	FP32	1280	0.394	0.572	0.424	18.64
YOLOv5n6 6.0	FP32	1280	0.294	0.452	0.314	26.94
YOLOv5x 6.0	FP32	640	0.469	0.654	0.509	8.24
YOLOv5l 6.0	FP32	640	0.450	0.634	0.487	14.96
YOLOv5m 6.0	FP32	640	0.415	0.601	0.448	28.30
YOLOv5s 6.0	FP32	640	0.334	0.516	0.355	63.55
YOLOv5n 6.0	FP32	640	0.250	0.417	0.260	110.25
YOLOv4-P6	FP32	1280	0.499	0.685	0.542	2.57
YOLOv4-P5	FP32	896	0.472	0.659	0.513	5.48
YOLOv4-CSP-X-SWISH	FP32	640	0.473	0.664	0.513	7.51
YOLOv4-CSP-SWISH	FP32	640	0.459	0.652	0.496	13.13
YOLOv4x-MISH	FP32	640	0.459	0.650	0.495	7.53
YOLOv4-CSP	FP32	640	0.440	0.632	0.474	13.19
YOLOv4	FP32	608	0.498	0.740	0.549	12.18
YOLOv4-Tiny	FP32	416	0.215	0.403	0.206	201.20
YOLOv3-SPP	FP32	608	0.411	0.686	0.433	12.22
YOLOv3-Tiny-PRN	FP32	416	0.167	0.382	0.125	277.14
YOLOv3	FP32	608	0.377	0.672	0.385	12.51
YOLOv3-Tiny	FP32	416	0.095	0.203	0.079	218.42
YOLOv2	FP32	608	0.286	0.541	0.273	25.28
YOLOv2-Tiny	FP32	416	0.102	0.258	0.061	231.36

mchi · February 14, 2022, 9:51am

Amazing! I really appreciate your work on DeepStream! This is helpful, and I’ll share it internally!

One question: in your test table, the YOLOv4/v3/v2 models are from Darknet YOLO, right?

marcoslucianops · February 14, 2022, 9:59am

Yes

marcoslucianops · February 17, 2022, 6:34pm

Moved the YOLO decoder from CPU to GPU to get better performance

Results

4x faster inference on AGX using the YOLOv5n model in FP16 mode

(Comparison images: CPU YOLO Decoder vs. GPU YOLO Decoder)
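
For context, "decoding" here means turning the raw outputs of a YOLO head into boxes, a per-cell, per-anchor computation that is independent for each prediction and therefore maps well to GPU threads. The sketch below shows that arithmetic for a single prediction in plain Python, following the Darknet conventions for scale_x_y and new_coords as I understand them. It is not the repository's GPU code, and decode_box and its arguments are hypothetical names.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def decode_box(tx, ty, tw, th, col, row, anchor_w, anchor_h,
               stride, scale_x_y=1.0, new_coords=False):
    """Decode one raw prediction into (cx, cy, w, h) in input-image pixels.

    tx, ty, tw, th: raw outputs for one anchor at grid cell (col, row).
    anchor_w, anchor_h: anchor size in pixels; stride: pixels per grid cell.
    """
    if new_coords:
        # Assumption: with new_coords the outputs already went through a
        # sigmoid in the network, and sizes use the (2 * t)^2 form.
        sx, sy = tx, ty
        w = (2.0 * tw) ** 2 * anchor_w
        h = (2.0 * th) ** 2 * anchor_h
    else:
        # Classic form: sigmoid for the centre, exp for the size.
        sx, sy = sigmoid(tx), sigmoid(ty)
        w = math.exp(tw) * anchor_w
        h = math.exp(th) * anchor_h

    # scale_x_y stretches the centre offset around the middle of the cell.
    offset = 0.5 * (scale_x_y - 1.0)
    cx = (sx * scale_x_y - offset + col) * stride
    cy = (sy * scale_x_y - offset + row) * stride
    return cx, cy, w, h


if __name__ == "__main__":
    print(decode_box(0.0, 0.0, 0.0, 0.0, col=3, row=2,
                     anchor_w=10.0, anchor_h=20.0, stride=8))
```

Because no prediction depends on any other, a GPU version can assign one thread per (cell, anchor) pair, which is the kind of work that benefits from leaving the CPU.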

marcoslucianops · February 22, 2022, 2:54am

Update:

GPU YOLO Decoder (moved from CPU to GPU to get better performance) #138
Improved NMS #142
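
The update does not say how NMS was improved, so the following is background only: a textbook greedy, class-agnostic non-maximum suppression in plain Python. It is a sketch of the general technique, not the repository's implementation, and the function names and the 0.45 threshold are assumptions chosen for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(detections, iou_threshold=0.45):
    """Greedy NMS over a list of (box, score) pairs.

    Visits detections from highest to lowest score and keeps one only if
    it does not overlap an already kept box by more than iou_threshold.
    """
    kept = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, k[0]) <= iou_threshold for k in kept):
            kept.append((box, score))
    return kept


if __name__ == "__main__":
    dets = [
        ((0, 0, 10, 10), 0.9),
        ((1, 1, 11, 11), 0.8),    # overlaps the first box heavily
        ((20, 20, 30, 30), 0.7),  # separate object
    ]
    print(nms(dets))
```

In practice NMS is usually run per class, so detections would be grouped by class label before calling a function like this.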

music1913 · March 10, 2022, 6:18am

I’m using TAO to train a yolo_v4_tiny model and have exported it to an .etlt file. I haven’t found much information on how to deploy it to a Jetson Nano (the DS6 samples also don’t include samples for YOLO). Does DeepStream 6 not support this?

marcoslucianops · March 10, 2022, 3:21pm

For YOLOv4 trained in TAO, you should use deepstream_tao_apps.

music1913 · March 11, 2022, 12:34am

@marcoslucianops thanks.
Your work, DeepStream-Yolo, is an improvement on deepstream_tao_apps, is that correct?

marcoslucianops · March 11, 2022, 12:00pm

No, it’s an improvement on objectDetector_Yolo.

system · March 25, 2022, 12:01pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
