How to use the cuDNN backend API to train a CNN with ReLU or BN layers?

Recently, we have been trying to train a MobileNet v1 network with the cuDNN backend API,
but we ran into two problems:

  1. We could not find explicit support for a batch normalization layer; the only thing we found is the pointwise operations.
  2. We could not find explicit support for ReLU backward; we could not obtain the mask of the ReLU forward pass via any option.

Hi @896849432 ,
Can you please refer to the links below and see if they help?
API Reference :: NVIDIA Deep Learning cuDNN Documentation
API Reference :: NVIDIA Deep Learning cuDNN Documentation


I have scanned the docs, but how can I put these BN ops into an operation graph of the cuDNN backend?

Hi @896849432 , thanks for the questions.

  1. Currently you would need to use the legacy API for BN; it has not been added to the v8 backend API yet.
  2. From the API it is possible to add relu_backward to the operation graph. It requires the original input of the ReLU forward pass to be passed in, and the mask is re-computed from that tensor. See the sample code below:

CHECK_ERROR(cudnnBackendSetAttribute(opDesc, CUDNN_ATTR_OPERATION_POINTWISE_ALPHA1, CUDNN_TYPE_DOUBLE, 1, &(this->alpha1) /*has to be 1.0 for fusion*/));
CHECK_ERROR(cudnnBackendSetAttribute(opDesc, CUDNN_ATTR_OPERATION_POINTWISE_ALPHA2, CUDNN_TYPE_DOUBLE, 1, &(this->alpha2) /*has to be 1.0 for fusion*/));
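To make the relu_backward description above concrete, here is a minimal, untested sketch of how such a node can be assembled with the v8 backend API. It assumes `xDesc` (the original forward input), `dyDesc`, and `dxDesc` are already-finalized `CUDNN_BACKEND_TENSOR_DESCRIPTOR`s; the `CHECK_ERROR` macro below is only a placeholder for the status checking used in the snippet above.

```cpp
#include <cudnn.h>

// Placeholder error check; substitute your own status handling.
#define CHECK_ERROR(expr)                                   \
    do {                                                    \
        cudnnStatus_t s_ = (expr);                          \
        if (s_ != CUDNN_STATUS_SUCCESS) { /* handle */ }    \
    } while (0)

void buildReluBackwardOp(cudnnBackendDescriptor_t xDesc,   // fwd input of the ReLU
                         cudnnBackendDescriptor_t dyDesc,  // incoming gradient
                         cudnnBackendDescriptor_t dxDesc,  // output gradient
                         cudnnBackendDescriptor_t *opOut) {
    // 1. Pointwise descriptor with mode CUDNN_POINTWISE_RELU_BWD.
    cudnnBackendDescriptor_t pwDesc;
    CHECK_ERROR(cudnnBackendCreateDescriptor(CUDNN_BACKEND_POINTWISE_DESCRIPTOR, &pwDesc));
    cudnnPointwiseMode_t mode = CUDNN_POINTWISE_RELU_BWD;
    CHECK_ERROR(cudnnBackendSetAttribute(pwDesc, CUDNN_ATTR_POINTWISE_MODE,
                                         CUDNN_TYPE_POINTWISE_MODE, 1, &mode));
    CHECK_ERROR(cudnnBackendFinalize(pwDesc));

    // 2. Operation node: the original x is attached via XDESC, and the
    //    ReLU mask is re-derived from it (x > 0) inside the kernel.
    cudnnBackendDescriptor_t opDesc;
    CHECK_ERROR(cudnnBackendCreateDescriptor(CUDNN_BACKEND_OPERATION_POINTWISE_DESCRIPTOR, &opDesc));
    CHECK_ERROR(cudnnBackendSetAttribute(opDesc, CUDNN_ATTR_OPERATION_POINTWISE_PW_DESCRIPTOR,
                                         CUDNN_TYPE_BACKEND_DESCRIPTOR, 1, &pwDesc));
    CHECK_ERROR(cudnnBackendSetAttribute(opDesc, CUDNN_ATTR_OPERATION_POINTWISE_XDESC,
                                         CUDNN_TYPE_BACKEND_DESCRIPTOR, 1, &xDesc));
    CHECK_ERROR(cudnnBackendSetAttribute(opDesc, CUDNN_ATTR_OPERATION_POINTWISE_DYDESC,
                                         CUDNN_TYPE_BACKEND_DESCRIPTOR, 1, &dyDesc));
    CHECK_ERROR(cudnnBackendSetAttribute(opDesc, CUDNN_ATTR_OPERATION_POINTWISE_DXDESC,
                                         CUDNN_TYPE_BACKEND_DESCRIPTOR, 1, &dxDesc));
    CHECK_ERROR(cudnnBackendFinalize(opDesc));
    *opOut = opDesc;
}
```

The finalized `opDesc` would then be added to an operation graph in the usual way.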


However, for MobileNet we are currently not able to fuse depthwise-separable convolutions with ReLU, so you would need to create a single-operation graph for the convolution and cannot add the ReLU to it. For the activation, for now you will have to use the legacy cudnnActivationForward/Backward API.
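For reference, the legacy activation path mentioned above would look roughly like the following untested sketch. It assumes a live `cudnnHandle_t`, a single shared tensor descriptor for x/y/dy/dx, and device buffers already allocated.

```cpp
#include <cudnn.h>

void reluForwardBackwardLegacy(cudnnHandle_t handle,
                               cudnnTensorDescriptor_t tDesc, // shared shape for x/y/dy/dx
                               const void *x, void *y,
                               const void *dy, void *dx) {
    cudnnActivationDescriptor_t actDesc;
    cudnnCreateActivationDescriptor(&actDesc);
    // The coef argument is ignored for plain ReLU (it is the clipping
    // threshold for CUDNN_ACTIVATION_CLIPPED_RELU).
    cudnnSetActivationDescriptor(actDesc, CUDNN_ACTIVATION_RELU,
                                 CUDNN_NOT_PROPAGATE_NAN, 0.0);

    const float alpha = 1.0f, beta = 0.0f;
    // Forward: y = relu(x)
    cudnnActivationForward(handle, actDesc, &alpha, tDesc, x, &beta, tDesc, y);
    // Backward: dx = dy where x > 0, else 0. Note that it takes y, dy,
    // and the original forward input x.
    cudnnActivationBackward(handle, actDesc, &alpha, tDesc, y, tDesc, dy,
                            tDesc, x, &beta, tDesc, dx);

    cudnnDestroyActivationDescriptor(actDesc);
}
```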

Thanks a lot for your answer.
Is there any way to create a custom operator (inside which I could call the legacy BN API), like a custom plugin in TensorRT?

Hi @896849432 , how do you plan to use cuDNN: through DL frameworks or through your own code?
We currently don't support customized ops like the one you have described. Would you be able to change the upper-level code to call into the legacy API for the BN nodes in the graph?
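If the upper-level code can be changed as suggested, calling the legacy BN API for one node would look roughly like this untested sketch. It assumes a live `cudnnHandle_t`, a finalized 4-D `xDesc`, and device buffers for the data and statistics; the averaging factor and epsilon below are just illustrative values.

```cpp
#include <cudnn.h>

void batchNormForwardTrainingLegacy(cudnnHandle_t handle,
                                    cudnnTensorDescriptor_t xDesc,
                                    const void *x, void *y,
                                    const void *scale, const void *bias,
                                    void *runMean, void *runVar,
                                    void *saveMean, void *saveInvVar) {
    // Derive the per-channel (1xCx1x1) descriptor from the data layout.
    cudnnTensorDescriptor_t bnDesc;
    cudnnCreateTensorDescriptor(&bnDesc);
    cudnnDeriveBNTensorDescriptor(bnDesc, xDesc, CUDNN_BATCHNORM_SPATIAL);

    const float alpha = 1.0f, beta = 0.0f;
    cudnnBatchNormalizationForwardTraining(
        handle, CUDNN_BATCHNORM_SPATIAL, &alpha, &beta,
        xDesc, x, xDesc, y,
        bnDesc, scale, bias,
        /*exponentialAverageFactor=*/0.1,   // illustrative value
        runMean, runVar,
        /*epsilon=*/1e-5,                   // illustrative value
        saveMean, saveInvVar);              // saved stats feed the backward call

    cudnnDestroyTensorDescriptor(bnDesc);
}
```

The saved mean/inverse-variance outputs are what `cudnnBatchNormalizationBackward` expects, so they should be kept around between the forward and backward passes.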

My understanding is that TRT is more like a DL framework: it sees the global graph and tries to do global optimizations on the graph partitions, which is why users may need plug-ins to run customized code for certain parts of the graph. cuDNN, by contrast, does not target global graph optimizations; we expect the caller to do the graph partitioning and to decide where to lower each part of the graph, so we haven't considered supporting plug-ins for a subgraph like that. It would be very interesting to hear about your use cases.

Thanks.
Currently we use cuDNN by calling the API for each layer,
and we were trying to speed up our training via the cuDNN backend.
We had assumed that the cuDNN backend worked like TRT, so we supposed we could speed up the whole training run by transforming all (or most) of the network into an operation graph.
For now we are going to give up on the backend API and tune the different convolution algorithms of the legacy API instead.

Hi @896849432 , if you can provide more details about what your graph looks like, we might be able to offer more suggestions on how to optimize it. There are operation graph patterns that can be fused through the backend API to speed up training; it just doesn't support arbitrary fusion.
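For anyone following along, the overall v8 backend flow for one fusable pattern looks roughly like the untested skeleton below: wrap the finalized operation nodes in an operation graph, ask the heuristics for an engine config, and build an execution plan. It assumes `handle` and an array `ops` of already-finalized operation descriptors (e.g. a convolution node, possibly with pointwise nodes where fusion is supported); error checking is elided for brevity.

```cpp
#include <cudnn.h>
#include <cstdint>

cudnnBackendDescriptor_t buildPlan(cudnnHandle_t handle,
                                   cudnnBackendDescriptor_t *ops,
                                   int64_t numOps) {
    // 1. Operation graph over the finalized nodes.
    cudnnBackendDescriptor_t opGraph;
    cudnnBackendCreateDescriptor(CUDNN_BACKEND_OPERATIONGRAPH_DESCRIPTOR, &opGraph);
    cudnnBackendSetAttribute(opGraph, CUDNN_ATTR_OPERATIONGRAPH_HANDLE,
                             CUDNN_TYPE_HANDLE, 1, &handle);
    cudnnBackendSetAttribute(opGraph, CUDNN_ATTR_OPERATIONGRAPH_OPS,
                             CUDNN_TYPE_BACKEND_DESCRIPTOR, numOps, ops);
    cudnnBackendFinalize(opGraph);

    // 2. Query the heuristics for an engine config for this graph.
    cudnnBackendDescriptor_t heur;
    cudnnBackendCreateDescriptor(CUDNN_BACKEND_ENGINEHEUR_DESCRIPTOR, &heur);
    cudnnBackendSetAttribute(heur, CUDNN_ATTR_ENGINEHEUR_OPERATION_GRAPH,
                             CUDNN_TYPE_BACKEND_DESCRIPTOR, 1, &opGraph);
    cudnnBackendHeurMode_t mode = CUDNN_HEUR_MODE_INSTANT;
    cudnnBackendSetAttribute(heur, CUDNN_ATTR_ENGINEHEUR_MODE,
                             CUDNN_TYPE_HEUR_MODE, 1, &mode);
    cudnnBackendFinalize(heur);

    cudnnBackendDescriptor_t engCfg;
    cudnnBackendCreateDescriptor(CUDNN_BACKEND_ENGINECFG_DESCRIPTOR, &engCfg);
    int64_t returned = 0;
    cudnnBackendGetAttribute(heur, CUDNN_ATTR_ENGINEHEUR_RESULTS,
                             CUDNN_TYPE_BACKEND_DESCRIPTOR, 1, &returned, &engCfg);

    // 3. Execution plan; run it later with cudnnBackendExecute plus a
    //    variant pack holding the device pointers and workspace.
    cudnnBackendDescriptor_t plan;
    cudnnBackendCreateDescriptor(CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR, &plan);
    cudnnBackendSetAttribute(plan, CUDNN_ATTR_EXECUTION_PLAN_HANDLE,
                             CUDNN_TYPE_HANDLE, 1, &handle);
    cudnnBackendSetAttribute(plan, CUDNN_ATTR_EXECUTION_PLAN_ENGINE_CONFIG,
                             CUDNN_TYPE_BACKEND_DESCRIPTOR, 1, &engCfg);
    cudnnBackendFinalize(plan);
    return plan;
}
```

In a real application you would check every returned status and fall back to another engine config when finalization of the plan fails.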