cuDNN 8.x.x vs cuDNN 7.6.5 performance drop

nrudakov · June 4, 2021, 3:14pm

There is a significant performance difference between cuDNN 7.6.5 and cuDNN 8.x.x. The program makes sequential calls to cuDNN convolution, batch normalization, and activation functions. The GPU is fully utilized when the program uses cuDNN 7, but large time gaps appear between kernel executions with cuDNN 8 (see the Nsight Systems timeline screenshots below).

CUDA 10.2 with cuDNN 7.6.5 (no gaps, GPU is utilized efficiently)

CUDA 10.2 with cuDNN 8.0.2 (huge time gaps, inefficient GPU utilization)

The same problem exists with various CUDA 11.x and cuDNN 8.x.x versions.

Any ideas what could be the reason for the performance drop?

AakankshaS · June 15, 2021, 3:41am

Hi @nrudakov ,
Could you please share the logs with us?

Thanks!

nrudakov · June 16, 2021, 7:05am

Hi!

Here is a OneDrive link to a zipped folder with several reports from Nsight Systems 2021.1.3. Profiling was done with different combinations of GPU card, CUDA and cuDNN.

(Microsoft OneDrive - Access files anywhere. Create docs with free Office Online.)

Each report contains 5 consecutive inference runs of a CNN:
In report “10.2-8.0.2-rtx-2080ti” look at the timeline around 18.370
In report “10.2-7.6.5-rtx-2080ti” look at the timeline around 12.420
In report “10.0-7.6.5-rtx-2080ti” look at the timeline around 4.200
In report “11.3-8.2.0-rtx-2080ti” look at the timeline around 17.880
and so on.
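
To put a number on the gaps visible in these timelines, one option is to take the kernel start and end times from each report and sum the idle time between consecutive kernels. The following is a minimal sketch, assuming the kernel intervals have already been exported as (start, end) pairs in seconds. The function name `gpu_idle_gaps` and the sample intervals are hypothetical and are not taken from the reports.

```python
def gpu_idle_gaps(kernels):
    """Summarize GPU activity for a list of (start, end) kernel intervals.

    Returns (busy_time, idle_time, utilization), where idle_time is the
    total time between the end of one kernel and the start of the next.
    Overlapping kernels are merged so they are not counted twice.
    """
    if not kernels:
        return 0.0, 0.0, 0.0

    intervals = sorted(kernels)
    busy = 0.0
    idle = 0.0
    cur_start, cur_end = intervals[0]

    for start, end in intervals[1:]:
        if start > cur_end:
            # A gap: close the current busy block and count the idle time.
            busy += cur_end - cur_start
            idle += start - cur_end
            cur_start, cur_end = start, end
        else:
            # Overlapping or back-to-back kernels extend the busy block.
            cur_end = max(cur_end, end)

    busy += cur_end - cur_start
    span = busy + idle
    utilization = busy / span if span > 0 else 0.0
    return busy, idle, utilization


if __name__ == "__main__":
    # Hypothetical intervals (seconds), for illustration only.
    dense = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)]   # back-to-back kernels
    gappy = [(0.0, 1.0), (3.0, 4.0), (6.0, 7.0)]   # same work, with gaps
    for name, data in (("dense", dense), ("gappy", gappy)):
        busy, idle, util = gpu_idle_gaps(data)
        print(f"{name}: busy={busy:.3f}s idle={idle:.3f}s utilization={util:.1%}")
```

Comparing the idle time and utilization for the same inference run under cuDNN 7.6.5 and cuDNN 8.x.x would show how much of the slowdown comes from the gaps rather than from slower kernels.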

nrudakov · June 22, 2021, 11:35am

The logs from Nsight Systems are in the previous message.

AakankshaS · August 26, 2021, 11:48am

Hi @nrudakov ,
Apologies for the delay. Are you still facing the issue?

nrudakov · August 26, 2021, 11:58am

Hi @AakankshaS ,
Yes, the problem still exists even in the latest cuDNN (8.2.2). I suspect that cuDNN 8 is causing those gaps. cuDNN 7.6.5 works fine. Unfortunately, it is impossible to use cuDNN 7 on Ampere GPUs.

AakankshaS · August 26, 2021, 12:02pm

Hi @nrudakov ,
Thank you for confirming.
Could you please share the logs with us again? It looks like the old link has expired.
Apologies for the inconvenience.
Thanks!

nrudakov · August 26, 2021, 12:19pm

@AakankshaS
The same link should work again:
