cudnnRNNForward() issue

Hello,

I am trying to get the cudnnRNNForward() function to work, but unfortunately it does not. The only parameters with which I have managed to make it work are:

inputSize == hiddenSize == projSize == batchSize == numLayers == maxSeqLength == vectorSize == 1

If I instead set these parameters to 2, the function returns CUDNN_STATUS_BAD_PARAM.

At first I thought I had made a mistake and swapped some dimension sizes in the parameters of the RNNDataDesc/RNNDesc descriptors. But with all dimensions equal (all set to 2, as above) that cannot be the case, so I cannot see where the problem comes from.
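For reference, here is a minimal sketch of the kind of data descriptor setup I mean, with everything set to 2 (xDesc and the values are illustrative, not my exact code; I use a packed layout, assuming padded I/O stays disabled):

int maxSeqLength = 2, batchSize = 2, vectorSize = 2;  // all dimensions set to 2
int seqLengthArray[2] = { 2, 2 };                     // one length per sequence in the batch
CUDNN_CALL(cudnnSetRNNDataDescriptor(xDesc,
    CUDNN_DATA_FLOAT,                        // dataType
    CUDNN_RNN_DATA_LAYOUT_SEQ_MAJOR_PACKED,  // layout (packed, since padded I/O is disabled)
    maxSeqLength,                            // maxSeqLength
    batchSize,                               // batchSize
    vectorSize,                              // vectorSize (== inputSize for the input descriptor)
    seqLengthArray,                          // per-sequence lengths
    NULL));                                  // paddingFill (unused with a packed layout)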

Would a kind soul be willing to help me?

CUDA version: 11.7
Graphics card: NVIDIA GeForce GTX 1050
cuDNN version: 8.0

Hi,

We hope the following documentation will help you. Also, we recommend using the latest cuDNN version.

Thank you.

Hi,

Thank you for your answer. Of course, I have already used the documentation to implement the call. Do you have a simple working example?

Thank you.

Hi,

I am afraid we do not have specific samples for this; only the following are available.

Please share with us the issue repro script and error logs for better help.

Thank you.

Hello,

Here is the code of my RNN forward function. I tried to find out what the BAD_PARAM error means (I enabled the error reporting with the environment variables, see: Developer Guide :: NVIDIA Deep Learning cuDNN Documentation), but as you can see in the attached image, I do not get any more details.
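For completeness, besides the CUDNN_LOGERR_DBG/CUDNN_LOGDEST_DBG environment variables, a log callback can also be installed programmatically; here is a minimal sketch (the callback name is mine, not from my repro):

#include <stdio.h>
#include <cudnn.h>

// Print every cuDNN error/warning message to stderr.
static void logCallback(cudnnSeverity_t sev, void *udata,
                        const cudnnDebug_t *dbg, const char *msg)
{
    (void)udata; (void)dbg;
    fprintf(stderr, "[cuDNN severity %d] %s\n", (int)sev, msg);
}

// Install before the first cuDNN call so BAD_PARAM details are reported.
cudnnSetCallback(CUDNN_SEV_ERROR_EN | CUDNN_SEV_WARNING_EN, NULL, logCallback);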

Thanks

cudnn_rnn_test.zip (15.8 MB)

Hi,

In the parameters of the cudnnRNNForward() call, you are trying to pass 1x1 matrices while requiring CUDNN_TENSOR_OP_MATH. Natively, TENSOR_OP_MATH does not work with just any matrix size.

Your program works on Ampere (GeForce RTX 3090), but on older chips cuBLAS may refuse to handle 1x1 matrices with TENSOR_OP_MATH. Besides, if we are correct, Pascal has no Tensor Core acceleration, and your GTX 1050 uses the GP107 chip with 5 or 6 SMs. Here are the relevant RNN descriptor fields from your API log:

i!   inputSize: type=int; val=1;
i!   hiddenSize: type=int; val=1;
i!   projSize: type=int; val=1;
i!   numLayers: type=int; val=1;
i!   dropoutDesc: type=cudnnDropoutDescriptor_t:
i!   seed: type=size_t; val=0;
i!   dropout: type=float; val=0.5;
i!   nstates: type=int; val=62976;
i!   inputMode: type=cudnnRNNInputMode_t; val=CUDNN_LINEAR_INPUT (0);
i!   bidirectional: type=cudnnDirectionMode_t; val=CUDNN_UNIDIRECTIONAL
i!   dataType: type=cudnnDataType_t; val=CUDNN_DATA_FLOAT (0);
i!   mathPrec: type=cudnnDataType_t; val=CUDNN_DATA_FLOAT (0);
i!   mathMode: type=cudnnMathType_t; val=CUDNN_TENSOR_OP_MATH (1);
i!   auxFlags: type=unsigned; val=CUDNN_RNN_PADDED_IO_DISABLED (0x0);
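The mathMode field above is the problem. As a quick check (a sketch only; device index 0 is an assumption), the math type can be chosen from the device's compute capability, since Tensor Cores first appeared with Volta (compute capability 7.0), while the GTX 1050 is Pascal (6.1):

#include <cuda_runtime.h>
#include <cudnn.h>

cudaDeviceProp prop;
cudaGetDeviceProperties(&prop, 0);  // device 0 assumed
// Tensor Cores require compute capability >= 7.0 (Volta and newer).
cudnnMathType_t mathType =
    (prop.major >= 7) ? CUDNN_TENSOR_OP_MATH : CUDNN_DEFAULT_MATH;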

Could you please switch to CUDNN_DEFAULT_MATH?

CUDNN_CALL(cudnnSetRNNDescriptor_v8(RNNDesc,  // cudnnRNNDescriptor_t
    CUDNN_RNN_ALGO_STANDARD,   // cudnnRNNAlgo_t
    CUDNN_LSTM,                // cudnnRNNMode_t
    CUDNN_RNN_NO_BIAS,         // cudnnRNNBiasMode_t
    CUDNN_UNIDIRECTIONAL,      // cudnnDirectionMode_t
    CUDNN_LINEAR_INPUT,        // cudnnRNNInputMode_t
    CUDNN_DATA_FLOAT,          // cudnnDataType_t (dataType)
    CUDNN_DATA_FLOAT,          // cudnnDataType_t (mathPrec)
    CUDNN_DEFAULT_MATH,        // cudnnMathType_t, instead of CUDNN_TENSOR_OP_MATH
    inp_vector,                // inputSize
    out_vector,                // hiddenSize
    out_vector,                // projSize
    1,                         // numLayers
    DropoutDesc,               // cudnnDropoutDescriptor_t
    0));                       // auxFlags
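Note that the weight, workspace, and reserve-space sizes depend on the RNN descriptor, so after changing the math type they should be queried again before calling cudnnRNNForward(); a minimal sketch (handle and xDesc stand in for your actual cuDNN handle and input RNN data descriptor):

size_t weightSpaceSize = 0, workSpaceSize = 0, reserveSpaceSize = 0;
CUDNN_CALL(cudnnGetRNNWeightSpaceSize(handle, RNNDesc, &weightSpaceSize));
CUDNN_CALL(cudnnGetRNNTempSpaceSize(handle, RNNDesc,
    CUDNN_FWD_MODE_INFERENCE,  // or CUDNN_FWD_MODE_TRAINING
    xDesc,                     // cudnnRNNDataDescriptor_t for the input
    &workSpaceSize,
    &reserveSpaceSize));       // reserve space is only needed for training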

If you still face an issue, please share the repro script, steps to run, and complete logs with us for better debugging.

Thank you.