cuDNN v8 backend API for Convolution

YashasSamaga · August 3, 2020, 7:05pm

The GTC presentation on cuDNN v8 hinted at an open-source C++ API for cuDNN. Where can I find it?

Is there a convolution sample that uses the new backend API? I can’t find any in the cudnn_samples_v8 directory. The documentation isn’t detailed enough for me to guess my way through either.

1. Slide 29 of the GTC cuDNN 8 presentation uses an INT64 type for the UID, while the developer guide uses text as the UID. Can you please clarify what type a UID is?
2. Can UIDs be reused across two completely different operation graphs? Is a UID local to each operation graph, or does it hold across all tensors in all operation graphs?
3. GTC slide 31 used CUDNN_TYPE_OPERATION, but it isn’t there anymore. What should be used instead?
4. What is CUDNN_ATTR_CONVOLUTION_SPATIAL_DIMS? Is it supposed to be an array of spatial dimensions (HW or DHW) or the number of spatial dimensions (2 or 3 respectively)?
5. How do I set the number of groups for convolution?
6. I have been trying to guess my way through. cuDNN 8.0.2 returns CUDNN_STATUS_BAD_PARAM at line 105 when calling cudnnBackendFinalize on a convolution forward descriptor. I am unable to diagnose the problem. Can you please look into it?

Code: cuda_common.hpp · GitHub

AakankshaS · August 4, 2020, 6:17am

Hi @YashasSamaga,
I have noted your query and am checking on this. Please allow me some time.
Meanwhile, I just wanted to check whether you are referring to the same link

Thanks!

YashasSamaga · August 4, 2020, 6:24am

GTC Slides: http://developer.download.nvidia.com/video/gputechconf/gtc/2020/presentations/s21685-cuDNN-v8-New-Advances-in-Deep-Learning-Acceleration-APIs-Optimizations-and-How-to-Tackle-the-Future-Challenges-in-Hardware-and-Software.pdf

Developer Guide: Documentation Archives :: NVIDIA Deep Learning cuDNN Documentation

Doesn’t look like the same link but is probably the same.

YashasSamaga · August 9, 2020, 9:37am

A few more questions:

Will cuDNN always fuse bias addition step with convolution if asked to?
Is it possible to check what operations have been fused in a selected engine?
Frameworks have their own fused kernels for bias, eltwise addition and activations. Prior to cuDNN 8, OpenCV used to use cuDNN’s fused convolution path if available. Otherwise the convolution would be done by cuDNN followed by a single fused kernel that would do bias addition, elementwise operations and activation.

So now if cuDNN 8 chooses an engine where bias addition is not fused with convolution, there would be three operations: cuDNN conv, cuDNN bias addition and end-user’s fused eltwise activation kernel. A faster solution would be: cuDNN conv and fused bias eltwise activation kernel.

How should I decide when to use cuDNN to fuse the operations and when to use the end user’s fused kernels?
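The comparison in the post above can be put into rough numbers. The following is a back-of-the-envelope sketch, not a cuDNN measurement: it assumes every standalone elementwise kernel run after the convolution reads the convolution output once and writes it once, and it ignores the bias vector and any second elementwise operand. `EpilogueCost` and `epilogueCost` are hypothetical names introduced only for this illustration.

```cpp
#include <cstdint>

// Rough cost of the work done AFTER the convolution kernel.
// Assumption: each separate elementwise kernel makes one read pass and
// one write pass over the convolution output tensor.
struct EpilogueCost {
    int kernelLaunches;       // number of extra kernels launched
    std::int64_t bytesMoved;  // approximate global-memory traffic in bytes
};

// outputElems:  number of elements in the convolution output
// elemSize:     size of one element in bytes (e.g. 4 for float)
// extraKernels: how many separate kernels run after the convolution
EpilogueCost epilogueCost(std::int64_t outputElems, std::int64_t elemSize,
                          int extraKernels) {
    // One read plus one write of the whole output per extra kernel.
    return {extraKernels, 2 * outputElems * elemSize * extraKernels};
}
```

Under this model, the three-operation path (cuDNN conv, cuDNN bias addition, user eltwise/activation kernel) corresponds to `extraKernels = 2`, and the two-operation path (cuDNN conv, one fused bias/eltwise/activation kernel) to `extraKernels = 1`, so the former moves roughly twice as much data after the convolution. It does not answer which path a given cuDNN engine will take; it only quantifies why the question matters.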

YashasSamaga · August 12, 2020, 4:44pm

@AakankshaS has there been any progress on this? cuDNN 8 with the v7 API is considerably slower than cuDNN 7. For now I have blamed it on the v7 API, as the release notes explicitly state that the v7 API doesn’t take care of fused convolutions. I am trying to implement this with the new backend API but have been struggling to resolve errors. The new API isn’t very developer friendly (it is very difficult to debug).

AakankshaS · August 20, 2020, 10:10am

Hi @YashasSamaga,
Apologies for the delayed response.
Here are the answers to your queries.

1. The actual type should be int64; the text is just for easier illustration.
2. Yes, you can reuse UIDs. All operation graphs are independent of each other. We don’t cache UIDs globally.
3. The recommendation is to call it this way: cudnnBackendSetAttribute(opGraph, CUDNN_ATTR_OPERATIONGRAPH_OPS, CUDNN_TYPE_BACKEND_DESCRIPTOR, numOps, ops);
4. It’s an int64_t value describing the number of spatial dims.
5. Refer to the v8 conv sample. There is a new group dim in the tensor descriptor, so instead of the old [N, C, H, W] we now have:
   X: [N, G, C, (D), H, W], with D being optional
   W: [G, K, C, (T), R, S], with T being optional
   Y: [N, G, K, (O), P, Q], with O being optional
6. One guess is that the group dim is missing from the tensor descriptors; see above. If the tensors are set up as above and you still see the issue, try adding the following code to set alpha/beta:

// Pass alpha/beta as double when the compute type is double, otherwise as float.
if (computeType == CUDNN_DATA_DOUBLE) {
    CHECK_CUDNN(cudnnBackendSetAttribute(opDesc, CUDNN_ATTR_OPERATION_CONVOLUTION_FORWARD_ALPHA, CUDNN_TYPE_DOUBLE, 1, &alpha));
    CHECK_CUDNN(cudnnBackendSetAttribute(opDesc, CUDNN_ATTR_OPERATION_CONVOLUTION_FORWARD_BETA, CUDNN_TYPE_DOUBLE, 1, &beta));
} else {
    float alphaf = float(alpha);
    float betaf = float(beta);
    CHECK_CUDNN(cudnnBackendSetAttribute(opDesc, CUDNN_ATTR_OPERATION_CONVOLUTION_FORWARD_ALPHA, CUDNN_TYPE_FLOAT, 1, &alphaf));
    CHECK_CUDNN(cudnnBackendSetAttribute(opDesc, CUDNN_ATTR_OPERATION_CONVOLUTION_FORWARD_BETA, CUDNN_TYPE_FLOAT, 1, &betaf));
}

Thanks!
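To make the grouped layout from the reply above concrete, here is a minimal sketch in plain C++ that builds an X-style dimension array [N, G, C, spatial...], derives fully packed strides for it, and counts the spatial dimensions. None of these helpers are cuDNN functions; `groupedDims`, `packedStrides`, and `spatialDimCount` are hypothetical names. The thread does not say whether C means total or per-group channels, so the sketch simply places whatever value is passed in, and the fully packed stride order is likewise an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: builds [N, G, C, spatial...], e.g. [N, G, C, H, W]
// for 2D or [N, G, C, D, H, W] for 3D convolution.
std::vector<std::int64_t> groupedDims(std::int64_t n, std::int64_t g,
                                      std::int64_t c,
                                      const std::vector<std::int64_t>& spatial) {
    assert(n > 0 && g > 0 && c > 0);
    std::vector<std::int64_t> dims = {n, g, c};
    dims.insert(dims.end(), spatial.begin(), spatial.end());
    return dims;
}

// Fully packed strides: the last dimension is contiguous (stride 1) and
// every other stride is the product of all dimensions to its right.
std::vector<std::int64_t> packedStrides(const std::vector<std::int64_t>& dims) {
    std::vector<std::int64_t> strides(dims.size(), 1);
    for (std::size_t i = dims.size(); i-- > 1;) {
        strides[i - 1] = strides[i] * dims[i];
    }
    return strides;
}

// Number of spatial dimensions in an [N, G, C, spatial...] array
// (2 for H, W; 3 for D, H, W). Returned as int64_t because the reply
// describes the spatial-dims attribute as an int64_t count.
std::int64_t spatialDimCount(const std::vector<std::int64_t>& dims) {
    assert(dims.size() >= 3);
    return static_cast<std::int64_t>(dims.size()) - 3;
}
```

For example, `groupedDims(1, 2, 8, {32, 32})` gives [1, 2, 8, 32, 32] with packed strides [16384, 8192, 1024, 32, 1], and `spatialDimCount` returns 2 for it.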

YashasSamaga · August 20, 2020, 11:10am

Thank you for the answers. I downloaded the latest libcudnn8-doc_8.0.2.39-1+cuda11.0_amd64.deb from https://developer.nvidia.com/rdp/cudnn-download and installed it. I am not able to find the v8 conv sample in the cudnn_samples_v8 directory. There is no v8 conv sample, but there is a sample that uses the v7 API. I ran grep -lr "Backend" in that directory and got no results.

Is the v8 conv sample part of the next release, or are the debs at the aforementioned link outdated?

AakankshaS · August 20, 2020, 7:09pm

Hi @YashasSamaga,
Are you following the link below?

Thanks!

YashasSamaga · August 21, 2020, 6:37am

Yes. I have also inspected the package contents with dpkg -x, without installing:

Downloaded libcudnn8-doc_8.0.2.39-1+cuda11.0_amd64.deb
dpkg -x libcudnn8-doc_8.0.2.39-1+cuda11.0_amd64.deb tempdir
cd tempdir/src/cudnn_samples_v8

There is no v8 sample but there is a v7 sample. It’s not an installation mistake. The correct files aren’t even there in the package.

You can verify it systematically by extracting the package contents and executing grep -lr "Backend". There is not even a single instance of that word which implies that there is no v8 sample.

I was directly in touch with cudnn@nvidia.com. They said that the v8 conv sample is packaged with the latest cuDNN release. But I am not able to find it.

AakankshaS · August 21, 2020, 8:09am

Hi @YashasSamaga,
I think you have not downloaded the package with samples.

Can you please check that?
Thanks!

YashasSamaga · August 21, 2020, 8:29am

There are three packages:

runtime library
developer library
code samples and user guide

I have installed all three for the Ubuntu x86_64 target.

I also downloaded the PPC package out of curiosity and checked its contents. It doesn’t have the v8 sample.

AakankshaS · August 21, 2020, 7:07pm

Hi @YashasSamaga,
The issue has been reported, and a fix will be available in a future release.
Please stay tuned.

Thanks!
