cuDNN: How is the notion of the tensor dimensions preserved in Nd layouts?


cuDNN offers two kinds of tensors and convolutions: the 4D variants and the general Nd variants. However, the documentation says little about how the notions of “number of samples” (N parameter), “channels” (C parameter), and “number of maps” (K parameter in the cuDNN paper, convolution[NCHW, K] = NKHW) are preserved in Nd layouts.

Consider a 2D convolution on 4D tensors. In 4D tensors the dimensions have an implied meaning, so the engine knows that the kernel’s H and W dimensions have to be mapped to the H and W positions in the tensor first. The kernel then has to be slid along the C dimension of the input. Hence, the engine knows that offsets [0, H * W) of the weight buffer correspond to C_0, offsets [H * W, 2 * H * W) correspond to C_1, and so on. Furthermore, it knows that the N dimension breaks the “mini-batch” input into “samples”. How is this notion preserved in the Nd case?
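To make the offset arithmetic above concrete, here is a minimal sketch in plain Python (it does not call cuDNN). It assumes a fully packed buffer in which the last dimension varies fastest; the helper names packed_strides and offset, and the sizes K=2, C=3, H=W=5, are made up for illustration.

```python
def packed_strides(dims):
    """Strides (in elements) of a fully packed tensor, last dimension fastest."""
    strides = [1] * len(dims)
    for i in range(len(dims) - 2, -1, -1):
        strides[i] = strides[i + 1] * dims[i + 1]
    return strides


def offset(index, strides):
    """Linear offset (in elements) of a multi-dimensional index."""
    return sum(i * s for i, s in zip(index, strides))


# Hypothetical filter: K=2 output maps, C=3 input channels, 5x5 kernel.
K, C, H, W = 2, 3, 5, 5
w_strides = packed_strides([K, C, H, W])

# Within one output map, channel c occupies offsets [c*H*W, (c+1)*H*W).
channel_starts = [offset((0, c, 0, 0), w_strides) for c in range(C)]
print(w_strides)       # strides for K, C, H, W
print(channel_starts)  # start offset of each channel in map 0
```

With these sizes, each channel slab starts a multiple of H * W elements into the buffer, which is exactly the implied meaning described above.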

P.S.: An example with tensor parameters for 3D convolution kernels on 5D inputs would really be helpful. (cudnnSetTensorNdDescriptor = ?, cudnnSetFilterNdDescriptor = ?, cudnnSetConvolutionNdDescriptor = ?)

Many thanks in advance,


I played around with the Nd functions for several hours. While cudnnSetTensorNdDescriptor and cudnnSetFilterNdDescriptor can easily be bent to accept any format you desire, cudnnSetConvolutionNdDescriptor cannot. It seems to imply a certain format for the participating tensors and filters (i.e. dim[0] = number of samples, dim[1] = number of channels). Can anybody confirm whether this assessment is correct?

Just for clarification: my understanding is that the following relationship is hard-wired within the library. Is this statement true?

TensorNdDescriptor(N, C, shape) => e.g. TensorNdDescriptor(N, C, Z, Y, X)

FilterNdDescriptor(K, C, shape) => e.g. FilterNdDescriptor(K, C, Z, Y, X)

ConvolutionNdDescriptor(shape-dimensional strides and padding) => e.g. ConvolutionNdDescriptor(Z_stride, Y_stride, X_stride)

You are correct: for convolution, cuDNN assumes that the first two dimensions of the tensors are N and C, respectively.
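For the 3D-on-5D case asked about earlier, the arrays one would hand to the three Nd descriptor setters can be sketched as below. This is plain Python that only computes the arrays, not an actual cuDNN call; the sizes are hypothetical, and the argument names in the comments (dimA, strideA, filterDimA, padA, filterStrideA, dilationA) follow the cuDNN documentation as far as I know, but exact signatures differ between cuDNN versions. The output-size formula is the one the documentation gives for cudnnGetConvolutionNdForwardOutputDim.

```python
def packed_strides(dims):
    """Strides (in elements) of a fully packed tensor, last dimension fastest."""
    strides = [1] * len(dims)
    for i in range(len(dims) - 2, -1, -1):
        strides[i] = strides[i + 1] * dims[i + 1]
    return strides


def conv_output_dims(x_dims, w_dims, pad, stride, dilation=None):
    """Output dims of an Nd convolution, assuming (N, C, spatial...) input
    and (K, C, spatial...) filter, as confirmed in the reply above."""
    n, c, *spatial = x_dims
    k, wc, *kernel = w_dims
    if wc != c:
        raise ValueError("filter C must match input C")
    if dilation is None:
        dilation = [1] * len(spatial)
    out = [
        1 + (i + 2 * p - ((f - 1) * d + 1)) // s
        for i, f, p, s, d in zip(spatial, kernel, pad, stride, dilation)
    ]
    return [n, k] + out


# Hypothetical sizes: N=8 samples, C=3 channels, volume Z=16, Y=32, X=32.
x_dims = [8, 3, 16, 32, 32]           # dimA for cudnnSetTensorNdDescriptor
x_strides = packed_strides(x_dims)    # strideA for cudnnSetTensorNdDescriptor
w_dims = [4, 3, 3, 3, 3]              # filterDimA: K=4, C=3, 3x3x3 kernel
pad = [1, 1, 1]                       # padA (one entry per spatial dim)
stride = [1, 1, 1]                    # filterStrideA
dilation = [1, 1, 1]                  # dilationA

y_dims = conv_output_dims(x_dims, w_dims, pad, stride, dilation)
print(x_strides, y_dims)
```

Note that the tensor and filter arrays have five entries (nbDims = 5), while the convolution arrays have only three, one per spatial dimension; N, C and K never appear in the convolution descriptor.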

Hello, I have a basic question.

Given TensorNdDescriptor(N, C), should the data array be organized as x00, x10, …, x(N-1)0, x01, x11, …, x(N-1)1, …, x0(C-1), x1(C-1), …, x(N-1)(C-1)? Is that right?

Thank you very much
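One way to check an ordering like this is to sort the indices by their linear offset for a given set of strides. The sketch below is plain Python, not a cuDNN call; the helper element_order and the sizes N=3, C=2 are made up for illustration. It compares a fully packed layout with the last dimension fastest, strides (C, 1), against the order written in the question, which corresponds to strides (1, N). Which of the two applies depends on the strides actually passed to the tensor descriptor.

```python
from itertools import product


def element_order(dims, strides):
    """Indices of a tensor sorted by their linear offset in memory."""
    indices = product(*[range(d) for d in dims])
    return sorted(indices, key=lambda ix: sum(i * s for i, s in zip(ix, strides)))


N, C = 3, 2

# Fully packed, last dimension (C) fastest: strides (C, 1).
c_fastest = element_order([N, C], [C, 1])

# Order from the question, first dimension (N) fastest: strides (1, N).
n_fastest = element_order([N, C], [1, N])

print(["x%d%d" % ix for ix in c_fastest])
print(["x%d%d" % ix for ix in n_fastest])
```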