Deepstream 6 YOLO performance issue

I can reproduce this issue. We are checking it, will get back to you ASAP.

Thanks!

I know you guys are busy
But
Any updates?

Hi @adventuredaisy ,
We are checking this with priority.
So far, we have found that the inference time of some layers is much longer on DS6.0GA than on DS5.1. We are working on a fix.

Thanks for the update

I know you guys will win the day!

How’s the fix coming?
Is there a timetable for when it will be rolled out?

Hey guys
I know you are working on the issue.
But I am dead in the water with DS6 when it comes to updating my YOLO projects from DS5.1 to Deepstream 6.
Just wondering when a fix would be coming out.

Thanks
Joe Valdivia

Noted! Sorry! Still working on it… will get back to you ASAP.

Hi @adventuredaisy ,
This issue is still being debugged. It may be related to nvdsinfer_custom_impl_Yolo/trt_utils.cpp, which builds the TensorRT model from the cfg file.
If this is urgent for you, could you try the TAO YOLOv3 network from NVIDIA-AI-IOT/deepstream_tao_apps on GitHub (sample apps that demonstrate how to deploy models trained with TAO on DeepStream)?

Thanks for the update.
I’m not in any bind at the moment.
I thought I would explore Omniverse Isaac Sim in the meantime.
Kind of excited about the Synthetic data generator that’s coming out.

@mchi Any update on the issue?

Hi All,
Sorry for the long delay!

Attached is the fix for this perf regression issue. Verified on my side.

$ cd /opt/nvidia/deepstream/deepstream-6.0/sources/objectDetector_Yolo/
$ patch -p1 < DS6.0GA_objectDetector_Yolo_perf_regression.patch
$ export CUDA_VER=11.4   # use your installed CUDA version; 11.4 is only an example
$ make -C nvdsinfer_custom_impl_Yolo

DS6.0GA_objectDetector_Yolo_perf_regression.patch (2.5 KB)

Thanks!

How do I apply this?
I tried this:

/opt/nvidia/deepstream/deepstream-6.0/sources/objectDetector_Yolo$ sudo git apply /home/nx/DS6.0_objectDetector_Yolo_perf_regression.patch

but it returned this:

warning: nvdsinfer_custom_impl_Yolo/nvdsinfer_yolo_engine.cpp has type 100755, expected 100644
error: cannot apply binary patch to ‘nvdsinfer_custom_impl_Yolo/nvdsinfer_yolo_engine.o’ without full index line
error: nvdsinfer_custom_impl_Yolo/nvdsinfer_yolo_engine.o: patch does not apply
warning: nvdsinfer_custom_impl_Yolo/yolo.cpp has type 100755, expected 100644
warning: nvdsinfer_custom_impl_Yolo/yolo.h has type 100755, expected 100644
error: cannot apply binary patch to ‘nvdsinfer_custom_impl_Yolo/yolo.o’ without full index line
error: nvdsinfer_custom_impl_Yolo/yolo.o: patch does not apply

Hi, @adventuredaisy
I applied it using this:

/opt/nvidia/deepstream/deepstream-6.0/sources/objectDetector_Yolo$ sudo patch -p1 < ~/DS6.0_objectDetector_Yolo_perf_regression.patch

I checked that the code changed after applying the patch file,
but I couldn’t see any difference in performance.

I am still getting about 11 fps…

Sorry! The patch includes some .o files and was generated under Docker, so applying it produces errors for the .o files and warnings about file permission changes. You can ignore these errors and warnings.
I went through the log, and it seems the patch was applied successfully.

You need to rebuild the libnvdsinfer_custom_impl_Yolo.so with
$ export CUDA_VER= …
$ make -C nvdsinfer_custom_impl_Yolo
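
Since the patch only changes source files, the fps stays the same until the library is rebuilt. The rebuild step above can be wrapped as a small sketch like the one below. The function name rebuild_yolo_lib and the YOLO_DIR variable are made up for illustration; the directory default is the path quoted in this thread, and the CUDA_VER check simply mirrors the export step above.

```shell
#!/bin/sh
# Sketch only: rebuild the custom YOLO library after the patch has been applied.
# YOLO_DIR is a hypothetical variable; its default is the path used in this thread.
YOLO_DIR="${YOLO_DIR:-/opt/nvidia/deepstream/deepstream-6.0/sources/objectDetector_Yolo}"

rebuild_yolo_lib() {
    # The steps in this thread export CUDA_VER before running make,
    # so stop early with a hint if it is missing or empty.
    if [ -z "${CUDA_VER:-}" ]; then
        echo "CUDA_VER is not set, e.g. export CUDA_VER=11.4" >&2
        return 1
    fi
    # Same command as in the steps above, with the directory made explicit.
    make -C "$YOLO_DIR/nvdsinfer_custom_impl_Yolo"
}
```

After sourcing this, running rebuild_yolo_lib with CUDA_VER exported does the rebuild; without CUDA_VER it prints the hint and returns 1 instead of starting the build.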

Sorry all!

I updated the patch and the steps to apply it in post #22 above.

My bad! It shows about 55 fps after rebuild.
Thanks!

Beautiful
55 fps also
My faith in your skills to surmount this problem never wavered
OK maybe a little bit
But you didn’t let us down.

Thank you

Could I have this issue when using the container nvcr.io/nvidia/deepstream:6.0-triton?
