Darknet slower using Jetpack 4.4 (cuDNN 8.0.0 / CUDA 10.2) than Jetpack 4.3 (cuDNN 7.6.3 / CUDA 10.0)

Hi

This might be an issue in darknet itself, so I’ve also opened an issue there: https://github.com/AlexeyAB/darknet/issues/5426 . If that turns out to be the case, just let me know and I will close this one.

Problem

When I use darknet and yolov3-tiny on my Jetson Nano with the latest JetPack 4.4 DP, I get worse performance than with JetPack 4.3. My guess is that this is related to cuDNN 8.0.0 vs 7.x.

I get 6.6 FPS with Jetpack 4.4 DP whereas I get 16.3 FPS with Jetpack 4.3

Complete procedure to reproduce

cuDNN: 8.0.0 CUDA 10.2 with CUDNN=1 flag (jetpack 4.4):

darknet build with:

GPU=1
CUDNN=1
OPENCV=1
ARCH= -gencode arch=compute_53,code=[sm_53,compute_53]

result for yolov3-tiny ./darknet detector demo cfg/coco.data cfg/yolov3-tiny.cfg yolov3-tiny.weights <videoinput>

CUDA-version: 10020 (10020), cuDNN: 8.0.0, GPU count: 1  
OpenCV version: 4.1.1

FPS:6.3

cuDNN: 8.0.0 CUDA 10.2 with CUDNN=0 flag (jetpack 4.4):

darknet build with:

GPU=1
CUDNN=0
OPENCV=1
ARCH= -gencode arch=compute_53,code=[sm_53,compute_53]

result for yolov3-tiny ./darknet detector demo cfg/coco.data cfg/yolov3-tiny.cfg yolov3-tiny.weights <videoinput>

CUDA-version: 10020 (10020), GPU count: 1  
OpenCV version: 4.1.1

FPS:13.3

cuDNN: 7.6.3 CUDA 10 with CUDNN=1 flag (jetpack 4.4):

GPU=1
CUDNN=1
OPENCV=1
ARCH= -gencode arch=compute_53,code=[sm_53,compute_53]

result for yolov3-tiny ./darknet detector demo cfg/coco.data cfg/yolov3-tiny.cfg yolov3-tiny.weights <videoinput>

CUDA-version: 10000 (10000), cuDNN: 7.6.3, GPU count: 1  
OpenCV version: 4.1.1

FPS:16.4

cuDNN: 7.6.3 CUDA 10 with CUDNN=0 flag (jetpack 4.3):

GPU=1
CUDNN=0
OPENCV=1
ARCH= -gencode arch=compute_53,code=[sm_53,compute_53]

result for yolov3-tiny ./darknet detector demo cfg/coco.data cfg/yolov3-tiny.cfg yolov3-tiny.weights <videoinput>

CUDA-version: 10000 (10000), cuDNN: 7.6.3, GPU count: 1  
OpenCV version: 4.1.1

FPS:13.3
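
To make the comparison easier to read, the four measurements above can be summarized with a short Python sketch. The FPS figures are copied from the results above; the dictionary layout and the helper name `relative_change` are only illustrative and not part of darknet.

```python
# FPS figures copied from the four yolov3-tiny runs reported above (Jetson Nano).
# Keys are (JetPack version, CUDNN build flag). The third run is keyed as
# JetPack 4.3, following the author's correction later in this thread.
fps = {
    ("4.4", 1): 6.3,
    ("4.4", 0): 13.3,
    ("4.3", 1): 16.4,
    ("4.3", 0): 13.3,
}


def relative_change(new, old):
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100.0


# Compare JetPack 4.4 against JetPack 4.3 for each build flag.
cudnn_change = relative_change(fps[("4.4", 1)], fps[("4.3", 1)])
no_cudnn_change = relative_change(fps[("4.4", 0)], fps[("4.3", 0)])

print(f"CUDNN=1: {cudnn_change:+.1f}% going from JetPack 4.3 to 4.4")
print(f"CUDNN=0: {no_cudnn_change:+.1f}% going from JetPack 4.3 to 4.4")
```

With these numbers the CUDNN=1 build loses roughly 60% of its throughput, while the CUDNN=0 build is unchanged, which is why cuDNN looks like the likely cause.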

Thanks

Hi,

Thanks for your report.
We will try to reproduce this issue and update you with more information later.

Thanks.

Hi,

May I know the setting of this experiment:

cuDNN: 7.6.3 CUDA 10 with CUDNN=1 flag (jetpack 4.4):

How did you install cuDNN 7.6.3 and CUDA 10.0 with JetPack 4.4?
Thanks.

Hi, Thanks.

This is a mistake, I meant jetpack 4.3 … Somehow I can’t edit the post

cuDNN: 7.6.3 CUDA 10 with CUDNN=1 flag (jetpack 4.3):

Basically, I flashed JetPack 4.3, ran the tests, then erased everything, flashed JetPack 4.4, and ran the same tests.

Hi,

So the performance without cuDNN is similar.
But with cuDNN it drops from 16 FPS to 6 FPS, is that correct?

We are trying to reproduce this issue.
We will update you with more information later.

Thanks.

Yes, this is what I experienced: a 10 FPS drop.

Many thanks !

Hi,

Just want to keep you updated.

This issue can be reproduced in our environment with Xavier.
We are going to check which cuDNN call causes the performance drop.

Thanks.

Good to know! Good luck! No problem, for now I’m sticking with JetPack 4.3 ;-)

Hi, I plan to test darknet on Xavier NX. I have found that only JetPack 4.4 DP is available for Xavier NX. Is it possible that this issue also applies to Xavier NX?

Hi,

We are working on this.
Will keep this topic updated once we solve this.

Thanks.

I am likewise experiencing a ~40% performance drop using a different detection model, with an EfficientNet backbone and a customized Retina head.

Same problem with PyTorch 1.4.
I followed this page to install PyTorch:

I am testing YOLOv5:

and found that the inference time is about 0.25 s on JetPack 4.4 and about 0.14 s on JetPack 4.3.

Same for me, using our Mask_RCNN (TensorFlow) based software on Jetson AGX Xavier.
With JetPack 4.4 (not DP), the software runs inference at 0.7 FPS, while the same software runs at 1.6 FPS with JetPack 4.3.
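
The YOLOv5 and Mask R-CNN reports above use different units (seconds per frame versus FPS). A minimal Python sketch, using only the figures quoted in those two posts, puts them on the same scale. The function names are illustrative and not taken from any of the frameworks mentioned.

```python
def latency_to_fps(seconds_per_frame):
    """Convert a per-frame inference time in seconds to frames per second."""
    return 1.0 / seconds_per_frame


def slowdown(fps_before, fps_after):
    """Return how many times slower the newer setup is (above 1 means slower)."""
    return fps_before / fps_after


# YOLOv5 on PyTorch 1.4: per-frame inference times as reported (seconds).
yolov5_fps_43 = latency_to_fps(0.14)
yolov5_fps_44 = latency_to_fps(0.25)

# Mask R-CNN on TensorFlow: throughput as reported (FPS).
maskrcnn_fps_43 = 1.6
maskrcnn_fps_44 = 0.7

yolov5_slowdown = slowdown(yolov5_fps_43, yolov5_fps_44)
maskrcnn_slowdown = slowdown(maskrcnn_fps_43, maskrcnn_fps_44)

print(f"YOLOv5:     {yolov5_fps_43:.1f} -> {yolov5_fps_44:.1f} FPS "
      f"({yolov5_slowdown:.2f}x slower)")
print(f"Mask R-CNN: {maskrcnn_fps_43:.1f} -> {maskrcnn_fps_44:.1f} FPS "
      f"({maskrcnn_slowdown:.2f}x slower)")
```

Both reports come out at roughly a 2x slowdown on JetPack 4.4, in the same range as the darknet result at the top of the thread.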

Any update on this topic?

Hi,

You can find some information in our cuDNN release notes.
https://docs.nvidia.com/deeplearning/cudnn/release-notes/rel_8.html#rel-800-Preview__section_f3r_df1_5kb

Known Issues

   ...
  • The performance of cudnnConvolutionBiasActivationForward() is slower than v7.6 in most cases. This is being actively worked on and performance optimizations will be available in the upcoming releases.

Thanks.

Hi,

I have the same problem: FPS drops when using JetPack 4.4 with cuDNN.

Is there an updated version of cuDNN available, and how can I install it on Jetson?

Thank you.

Hi,

Currently, our latest software is the JetPack 4.4 production release, which includes cuDNN v8.0.0.
Thanks.
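
For anyone unsure which cuDNN version a given install actually loads, here is a hedged Python sketch that queries the shared library directly through `ctypes`. It assumes the library can be found under the name `cudnn` by the system loader, and that the version code follows the cuDNN 7.x/8.x scheme (major*1000 + minor*100 + patch). The helper names are illustrative.

```python
import ctypes
import ctypes.util


def decode_cudnn_version(code):
    """Split a cuDNN 7.x/8.x version code (major*1000 + minor*100 + patch)."""
    major, rest = divmod(code, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def installed_cudnn_version(lib_name="cudnn"):
    """Return (major, minor, patch) of the cuDNN library found, or None."""
    path = ctypes.util.find_library(lib_name)
    if path is None:
        return None  # no cuDNN visible to the system loader
    try:
        lib = ctypes.CDLL(path)
        get_version = lib.cudnnGetVersion  # size_t cudnnGetVersion(void)
    except (OSError, AttributeError):
        return None  # library could not be loaded or lacks the symbol
    get_version.restype = ctypes.c_size_t
    return decode_cudnn_version(get_version())


if __name__ == "__main__":
    # Prints a tuple such as (major, minor, patch), or None if cuDNN is absent.
    print(installed_cudnn_version())
```

The value printed should agree with the `cuDNN:` field in darknet's startup line when darknet is built with CUDNN=1.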

Is the bug fixed in cuDNN 8.0.2? If so, can I update cuDNN on the Jetson?

Thanks.

I installed an older version of cuDNN over JetPack 4.4 and performance is fine again:

Hello.
I am on JetPack 4.4.
I have the same issue (performance is slower than with cuDNN 7).
So I want to downgrade my cuDNN version, but I don’t know how to remove the current cuDNN package.
How did you reinstall cuDNN 7.6.5 without reflashing?