Tensor Descriptors and Computational Graph in cuDNN

Hi everyone,
I am currently using cuDNN to develop a basic CNN architecture for a project. This includes convolutional, relu, maxpool, fully-connected and softmax activation layers. I am stuck with the idea of tensor descriptors to be passed to the forward and backward routines of cuDNN.

  • What do the descriptors really store? Do they just hold shape information (e.g. the NCHW dimensions) for whichever input or output float* buffer they describe?
  • Do tensor descriptors point to a location in device memory for the buffers? What is their purpose from a memory perspective?

My second question is about computational graphs in cuDNN.

  • Does cuDNN build a computational graph for gradient flow through the entire network model? Or does it just compute gradients from whatever local buffer values are passed to the backward functions, i.e., is cuDNN just a collection of independent compute functions?
  • If I change the input/output tensor descriptors of consecutive layers, how will that affect my network? For example, consider a simple CONV → RELU pipeline. Does the input descriptor of the RELU forward have to be the output descriptor of the CONV forward? Does cuDNN track tensors across calls, so that if layers are not connected properly, the computational graph (if there is one) would be incomplete?

Please help me understand these; they have been bugging me for a while.

Tensor descriptor: describes the size, layout, and data type of the tensor it refers to (e.g. the x input tensor) — metadata only, not the data itself. https://docs.nvidia.com/deeplearning/sdk/cudnn-developer-guide/index.html#tensor-descriptor

“The cuDNN Library exposes a Host API but assumes that for operations using the GPU, the necessary data is directly accessible from the device.” – So the data buffers are expected to already live in device memory; the descriptors themselves are host-side metadata. Check out this section on cuDNN’s programming model: https://docs.nvidia.com/deeplearning/sdk/cudnn-developer-guide/index.html#programming-model
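To make the “metadata only” point concrete, here is a small Python sketch — this is not cuDNN code, and names like TensorDesc and relu_forward are made up for illustration. It mimics the calling convention where a descriptor holds dimensions, layout strides, and a data type, while the actual buffer is a separate argument to each routine (the way cuDNN routines take both a descriptor and a device pointer):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TensorDesc:
    """Conceptual stand-in for a cuDNN tensor descriptor: metadata only,
    no pointer to the data it describes."""
    n: int
    c: int
    h: int
    w: int
    dtype: str = "float32"

    def strides(self):
        # Packed NCHW layout: w is the fastest-varying dimension.
        return (self.c * self.h * self.w, self.h * self.w, self.w, 1)

    def num_elements(self):
        return self.n * self.c * self.h * self.w

def relu_forward(x_desc, x_buf):
    """Mimics the calling convention: the descriptor tells the routine how
    to interpret x_buf; the buffer itself is passed as a separate argument."""
    assert len(x_buf) == x_desc.num_elements(), "descriptor/buffer mismatch"
    return [max(v, 0.0) for v in x_buf]

desc = TensorDesc(n=1, c=2, h=2, w=2)
print(desc.strides())  # (8, 4, 2, 1)
print(relu_forward(desc, [-1.0, 2.0, -3.0, 4.0, 5.0, -6.0, 7.0, -8.0]))
```

Swapping the descriptor for one with different dimensions, without touching the buffer, is exactly the kind of mismatch the real library can only catch by checking sizes per call.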

In general, the current layer’s output descriptor should match the next layer’s input descriptor, but they don’t have to be the same descriptor object — each cuDNN call is independent and only validates what you pass to it. It’s probably possible to edit the tensor between layers (say you implemented some custom op in between) and end up with a scenario where they don’t necessarily match.
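That “independent calls” point can be sketched as follows — again plain Python with hypothetical names, not the cuDNN API. Each “layer” call checks only the dimensions it is handed; nothing ties the RELU’s input descriptor to the CONV’s output descriptor except that you choose to pass compatible metadata along with the same buffer:

```python
class ShapeMismatch(Exception):
    pass

def conv_forward(x_desc, x, w_desc):
    """Toy 1x1 'convolution': derives the output shape from the descriptors
    it was given, as a stand-in for how output dims come from the inputs."""
    n, c, h, w = x_desc
    k = w_desc[0]  # number of output channels
    y_desc = (n, k, h, w)
    y = [0.0] * (n * k * h * w)  # dummy output buffer
    return y_desc, y

def relu_forward(x_desc, x):
    """Per-call validation only: the descriptor must describe this buffer,
    but no global graph links this call to the conv above."""
    n, c, h, w = x_desc
    if len(x) != n * c * h * w:
        raise ShapeMismatch("descriptor does not describe this buffer")
    return [max(v, 0.0) for v in x]

x_desc = (1, 3, 4, 4)
y_desc, y = conv_forward(x_desc, [0.0] * 48, w_desc=(8, 3, 1, 1))
relu_out = relu_forward(y_desc, y)   # reusing conv's output descriptor: fine
# relu_forward((1, 8, 2, 2), y)      # wrong dims for this buffer -> ShapeMismatch
```

If the descriptors between consecutive calls disagree, there is no “incomplete graph” to detect — at best each call raises a per-call dimension error, and at worst it silently reinterprets the buffer under the wrong shape.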