Deepstream 6 YOLO performance issue

adventuredaisy · November 7, 2021, 12:41am

I have just downloaded DeepStream 6 and am having performance issues with the YOLO example app: the video performance is extremely slow.
When I run the example on my Xavier NX with DeepStream 5.0 it runs at around 56 fps.
The same file on the same NX with DeepStream 6.0 runs at around 6 fps.
Here is a video to explain:

nvplayer · November 7, 2021, 1:07pm

Hi @adventuredaisy, your issue is very strange. I run a custom YOLOv3 model on DS 6.0 without problems: my 20-class custom YOLOv3 gets about 15 fps, and YOLOv3-tiny gets higher fps.

I suggest you use JetPack 4.6 and modify your main config file for deepstream-app. I once saw a low-fps case caused by a bad RTSP pipeline. Many plugins can cause low fps, so please share your pipeline or your main config.

I'm using DS 6.0 at my company, so I would like to hear about your progress!

adventuredaisy · November 7, 2021, 1:44pm

I am running the sample apps that come with DeepStream, and I have not made any modifications to them.

When I run this example on my Xavier NX using jetpack 4.6 and Deepstream 5.0 I get 58 fps

/opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_Yolo

But when I run this example on the same Xavier NX using jetpack 4.6 and deepstream 6.0 I can only get 6 fps

/opt/nvidia/deepstream/deepstream-6.0/sources/objectDetector_Yolo

I have worked with the Yolo model using deepstream since deepstream first came out.
I have always achieved excellent performance from Deepstream and the YOLO applications.

That is why I find this odd. I have checked everything I could think of to see what is throttling the performance,
and the only thing I can find different is the use of Deepstream 6.

mchi · November 8, 2021, 12:38am

Thanks, @nvplayer !

@adventuredaisy ,
Could you share the output of
$ cat /etc/nv_tegra_release

Did you boost the clocks?
$ sudo nvpmodel -m 0
$ sudo jetson_clocks

Also, please share the output of “sudo tegrastats” while the issue is reproducing.

And what fps do you get with this command?
$ /usr/src/tensorrt/bin/trtexec --loadEngine=$TRT_ENGINE_GENERATED_IN_DeepStream

Thanks!

adventuredaisy · November 8, 2021, 1:38am

mchi
attached is screen shot with

$ cat /etc/nv_tegra_release

$ sudo nvpmodel -m 0
$ sudo jetson_clocks

results

sudo nvpmodel -m 0 is giving this result

nx@nx-desktop:~$ sudo nvpmodel -m 0
NVPM WARN: patching tpc_pg_mask: (0x1:0x4)
NVPM WARN: patched tpc_pg_mask: 0x4

will work on “sudo tegrastats” info

adventuredaisy · November 8, 2021, 1:54am

RAM 6243/7773MB (lfb 160x4MB) SWAP 44/3887MB (cached 0MB) CPU [25%@1190,43%@1190,off,off,off,off] EMC_FREQ 60%@1600 GR3D_FREQ 99%@1109 NVDEC 665 NVDEC1 665 VIC_FREQ 0%@115 APE 150 MTS fg 1% bg 5% AO@38C GPU@39.5C PMIC@50C AUX@36C CPU@38C thermal@37.5C VDD_IN 15336/15336 VDD_CPU_GPU_CV 8370/8370 VDD_SOC 2164/2164
RAM 6243/7773MB (lfb 160x4MB) SWAP 44/3887MB (cached 0MB) CPU [27%@1900,34%@1903,off,off,off,off] EMC_FREQ 61%@1600 GR3D_FREQ 11%@1109 NVDEC 665 NVDEC1 665 VIC_FREQ 38%@115 APE 150 MTS fg 1% bg 6% AO@38.5C GPU@39C PMIC@50C AUX@36C CPU@38C thermal@38C VDD_IN 14804/15070 VDD_CPU_GPU_CV 7880/8125 VDD_SOC 2123/2143
RAM 6243/7773MB (lfb 160x4MB) SWAP 44/3887MB (cached 0MB) CPU [26%@1651,41%@1497,off,off,off,off] EMC_FREQ 63%@1600 GR3D_FREQ 99%@1109 NVDEC 665 NVDEC1 665 VIC_FREQ 0%@115 APE 150 MTS fg 1% bg 8% AO@38.5C GPU@40C PMIC@50C AUX@36.5C CPU@38C thermal@38.15C VDD_IN 15336/15158 VDD_CPU_GPU_CV 8248/8166 VDD_SOC 2204/2163
RAM 6243/7773MB (lfb 160x4MB) SWAP 44/3887MB (cached 0MB) CPU [28%@1904,32%@1904,off,off,off,off] EMC_FREQ 63%@1600 GR3D_FREQ 94%@1109 NVDEC 665 NVDEC1 665 VIC_FREQ 87%@115 APE 150 MTS fg 1% bg 6% AO@39C GPU@40.5C PMIC@50C AUX@36.5C CPU@38C thermal@38.15C VDD_IN 15090/15141 VDD_CPU_GPU_CV 8125/8155 VDD_SOC 2123/2153
RAM 6243/7773MB (lfb 160x4MB) SWAP 44/3887MB (cached 0MB) CPU [40%@1903,31%@1904,off,off,off,off] EMC_FREQ 64%@1600 GR3D_FREQ 99%@1109 NVDEC 665 NVDEC1 665 VIC_FREQ 0%@115 APE 150 MTS fg 1% bg 6% AO@39C GPU@40.5C PMIC@50C AUX@36.5C CPU@38.5C thermal@38.2C VDD_IN 15376/15188 VDD_CPU_GPU_CV 8288/8182 VDD_SOC 2204/2163
RAM 6243/7773MB (lfb 160x4MB) SWAP 44/3887MB (cached 0MB) CPU [25%@1190,37%@1190,off,off,off,off] EMC_FREQ 63%@1600 GR3D_FREQ 99%@1109 NVDEC 665 NVDEC1 665 VIC_FREQ 0%@115 APE 150 MTS fg 1% bg 8% AO@39C GPU@40.5C PMIC@50C AUX@37C CPU@38.5C thermal@38.15C VDD_IN 15336/15213 VDD_CPU_GPU_CV 8329/8206 VDD_SOC 2164/2163
RAM 6243/7773MB (lfb 160x4MB) SWAP 44/3887MB (cached 0MB) CPU [33%@1904,34%@1903,off,off,off,off] EMC_FREQ 63%@1600 GR3D_FREQ 64%@1109 NVDEC 665 NVDEC1 665 VIC_FREQ 2%@115 APE 150 MTS fg 0% bg 0% AO@39C GPU@40.5C PMIC@50C AUX@37C CPU@39C thermal@38.5C VDD_IN 14927/15172 VDD_CPU_GPU_CV 8003/8177 VDD_SOC 2123/2157
RAM 6243/7773MB (lfb 160x4MB) SWAP 44/3887MB (cached 0MB) CPU [42%@1190,21%@1190,off,off,off,off] EMC_FREQ 64%@1600 GR3D_FREQ 99%@1109 NVDEC 665 NVDEC1 665 VIC_FREQ 0%@115 APE 150 MTS fg 1% bg 4% AO@39.5C GPU@41C PMIC@50C AUX@37C CPU@39C thermal@38.5C VDD_IN 15336/15192 VDD_CPU_GPU_CV 8288/8191 VDD_SOC 2204/2163
RAM 6243/7773MB (lfb 160x4MB) SWAP 44/3887MB (cached 0MB) CPU [34%@1903,36%@1904,off,off,off,off] EMC_FREQ
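For anyone reading these samples: GR3D_FREQ is the GPU load, and it sits at 99% in most of the lines above. A minimal sketch for pulling that field out of a saved tegrastats log, assuming the log is fed on stdin. The function names gpu_util and gpu_util_avg and the log file name are made up for illustration and are not part of any NVIDIA tool.

```shell
# Print the GR3D_FREQ (GPU) load percentage from each tegrastats line on stdin.
# Lines without a GR3D_FREQ field (such as a truncated last line) are skipped.
gpu_util() {
  sed -n 's/.*GR3D_FREQ \([0-9][0-9]*\)%.*/\1/p'
}

# Average the GPU load over all samples, printed with one decimal place.
gpu_util_avg() {
  gpu_util | awk '{ sum += $1; n++ } END { if (n) printf "%.1f\n", sum / n }'
}

# Usage (tegrastats_log.txt is a hypothetical file name):
#   gpu_util_avg < tegrastats_log.txt
```

A GPU that is pinned near 99% while the pipeline only reaches a few fps points at the inference step rather than at decoding or display.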

mchi · November 8, 2021, 1:47pm

I tried DS 6.0 GA on Jetson NX with JetPack 4.6; its performance is about 14 fps, as shown below.
I’ll check DS5.1 again.

root@nvidia-desktop:/opt/nvidia/deepstream/deepstream/sources/objectDetector_Yolo# deepstream-app -c deepstream_app_config_yoloV3.txt
…
NvMMLiteOpen : Block : BlockType = 261
NVMEDIA: Reading vendor.tegra.display-size : status: 6
NvMMLiteBlockCreate : Block : BlockType = 261
** INFO: <bus_callback:180>: Pipeline running

**PERF: 16.61 (15.40)
**PERF: 14.07 (14.51)
**PERF: 14.19 (14.46)
**PERF: 14.49 (14.38)
**PERF: 14.24 (14.39)
**PERF: 13.46 (14.24)
**PERF: 14.56 (14.23)
**PERF: 14.48 (14.31)
**PERF: 13.85 (14.20)
**PERF: 14.31 (14.24)
**PERF: 14.00 (14.24)
**PERF: 14.35 (14.24)
**PERF: 14.15 (14.20)
**PERF: 13.88 (14.20)
**PERF: 14.26 (14.19)
**PERF: 14.10 (14.20)
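To compare runs between DeepStream versions, it can help to reduce a log like the one above to a single number. A minimal sketch, assuming the deepstream-app console output was saved to a file and that each sample line has the form `**PERF: <fps> (<avg>)` as shown here. The function name perf_avg and the log file name are hypothetical.

```shell
# Average the first number on every "**PERF:" line read from stdin.
# Lines whose second field is not numeric (for example a header row) are ignored.
perf_avg() {
  awk '/^\*\*PERF:/ && $2 ~ /^[0-9.]+$/ { sum += $2; n++ }
       END { if (n) printf "%.2f\n", sum / n }'
}

# Usage (ds6_run.log is a hypothetical file name):
#   perf_avg < ds6_run.log
```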

adventuredaisy · November 8, 2021, 4:22pm

mchi
Here is a video comparing JetPack 4.6 with DeepStream 6.0 against JetPack 4.6 with DeepStream 5.1.
JetPack 4.6 with DeepStream 5.1 will run at 58 fps using the same example.

https://youtu.be/OLp9yxe0DTY

mchi · November 9, 2021, 10:54am

I can reproduce this issue. We are checking it, will get back to you ASAP.

Thanks!

adventuredaisy · November 11, 2021, 1:35pm

I know you guys are busy, but are there any updates?

mchi · November 11, 2021, 1:46pm

Hi @adventuredaisy ,
We are checking this with priority.
So far, we have found that the inference time of some layers is much longer on DS 6.0 GA than on DS 5.1. We are working on a fix.

adventuredaisy · November 11, 2021, 4:02pm

Thanks for the update

I know you guys will win the day!

adventuredaisy · November 15, 2021, 3:58pm

How’s the fix coming?
Is there a timetable for when it will be rolled out?

adventuredaisy · November 16, 2021, 3:28pm

Hey guys
I know you are working on the issue.
But I am dead in the water with DS6 when it comes to updating my YOLO projects from DS5.1 to Deepstream 6.
Just wondering when a fix would be coming out.

Thanks
Joe Valdivia

mchi · November 16, 2021, 3:30pm

Noted! Sorry! Still working on it… will get back to you ASAP

mchi · November 19, 2021, 1:32am

Hi @adventuredaisy ,
This issue is still being debugged. It may be related to nvdsinfer_custom_impl_Yolo/trt_utils.cpp, which builds the TensorRT model from the cfg file.
If this is urgent for you, could you try the TAO YOLOv3 network from NVIDIA-AI-IOT/deepstream_tao_apps on GitHub (sample apps that demonstrate how to deploy models trained with TAO on DeepStream)?

adventuredaisy · November 19, 2021, 2:15pm

Thanks for the update.
I’m not in any bind at the moment.
I thought I would explore Omniverse Isaac Sim in the meantime.
Kind of excited about the Synthetic data generator that’s coming out.

akv404 · November 24, 2021, 5:57am

@mchi Any update on the issue?

mchi · December 1, 2021, 4:39am

Hi All,
Sorry for the long delay!

Attached is the fix for this perf regression. Verified on my side.

$ cd /opt/nvidia/deepstream/deepstream-6.0/sources/objectDetector_Yolo/
$ patch -p1 < DS6.0GA_objectDetector_Yolo_perf_regression.patch
$ export CUDA_VER=    # specify the CUDA version, e.g. export CUDA_VER=11.4
$ make -C nvdsinfer_custom_impl_Yolo

DS6.0GA_objectDetector_Yolo_perf_regression.patch (2.5 KB)

Thanks!

adventuredaisy · December 2, 2021, 1:59am

How do I apply this?
I tried this:

/opt/nvidia/deepstream/deepstream-6.0/sources/objectDetector_Yolo$ sudo git apply /home/nx/DS6.0_objectDetector_Yolo_perf_regression.patch

but it returned this:

warning: nvdsinfer_custom_impl_Yolo/nvdsinfer_yolo_engine.cpp has type 100755, expected 100644
error: cannot apply binary patch to 'nvdsinfer_custom_impl_Yolo/nvdsinfer_yolo_engine.o' without full index line
error: nvdsinfer_custom_impl_Yolo/nvdsinfer_yolo_engine.o: patch does not apply
warning: nvdsinfer_custom_impl_Yolo/yolo.cpp has type 100755, expected 100644
warning: nvdsinfer_custom_impl_Yolo/yolo.h has type 100755, expected 100644
error: cannot apply binary patch to 'nvdsinfer_custom_impl_Yolo/yolo.o' without full index line
error: nvdsinfer_custom_impl_Yolo/yolo.o: patch does not apply
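The two errors above are both about compiled `.o` files, not the `.cpp`/`.h` sources, so it may help to see which object files the patch touches before choosing how to apply it. A minimal sketch, assuming the patch is a git-format diff with `diff --git a/<path> b/<path>` header lines (the "binary patch" wording in the errors suggests this, but it has not been checked against the actual attachment). The function name patch_object_files is hypothetical.

```shell
# List the object files (*.o) named in the "diff --git" headers of a patch on stdin.
patch_object_files() {
  sed -n 's|^diff --git a/\(.*\.o\) b/.*|\1|p'
}

# Usage, with the patch path from the post above:
#   patch_object_files < /home/nx/DS6.0_objectDetector_Yolo_perf_regression.patch
```

Note that the instructions posted with the patch use `patch -p1 < ...` rather than `git apply`, followed by `make -C nvdsinfer_custom_impl_Yolo`, which rebuilds the object files anyway. If `git apply` is preferred, its `--exclude=<path-pattern>` option (for example `--exclude='*.o'`) is one thing to try for skipping the binary hunks; whether that is enough for this particular patch is an assumption.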
