Has anyone built a workable (even a toy) cudnn rnn model in C/C++? I tried to extend the example provided with the cudnn library, but it doesn’t seem to work - the weights and outputs don’t get updated over multiple epochs.
I’m trying to do the same here, and as far as I know the training for the example was done with Caffe.
If you want to train your RNN you need to use cudnnRNNForwardTraining, cudnnRNNBackwardData and cudnnRNNBackwardWeights. That’s all I know, I’m trying hard to build a complete example with LSTM.
Yes, the RNN example that comes with cudnn uses cudnnRNNForwardTraining, cudnnRNNBackwardData and cudnnRNNBackwardWeights - but it seems it lacks a loss function, so inevitably, it needs to be extended. But even so, it looks like there is a major bug in one of these three functions.
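To make the missing piece concrete, here is a minimal sketch of the kind of loss step I mean, assuming a plain mean squared error over a host-side copy of the output y. The function name `mse_loss_and_grad` and the choice of MSE are my own assumptions, not something from the cudnn sample. The resulting dy would then be copied to the device and passed as the output gradient to cudnnRNNBackwardData.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of cuDNN): mean squared error over n outputs.
 * y and target are host-side arrays; dL/dy is written into dy.
 * Returns the scalar loss value. */
float mse_loss_and_grad(const float *y, const float *target, float *dy, size_t n)
{
    assert(y != NULL && target != NULL && dy != NULL && n > 0);

    float loss = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        float diff = y[i] - target[i];
        loss += diff * diff;
        /* d/dy_i of (1/n) * sum(diff^2) is 2 * diff / n */
        dy[i] = 2.0f * diff / (float)n;
    }
    return loss / (float)n;
}
```

If dy stays constant from one epoch to the next, the gradients coming out of the backward calls will not reflect the actual error, so this step has to run on the fresh outputs of every forward pass.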
So I checked a little further, and I think that cudnnRNNBackwardData returns gradients and cudnnRNNBackwardWeights “accumulates weight gradients dw from the recurrent neural network described by rnnDesc with inputs x, hx, and outputs y”. However, you still have to update your weights manually with the learning rate and the gradients, no?
In fact, from what I understood, none of these functions updates the weights; they just return the values needed for the update.
Do you agree?
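Here is a minimal sketch of what I mean by a manual update, assuming plain SGD and host-side copies of w and dw. The name `sgd_step` is hypothetical and not part of cudnn. Since the documentation says the weight gradients are accumulated into dw, the sketch also clears dw after each step, so the next cudnnRNNBackwardWeights call starts from zero.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of cuDNN): one plain SGD step.
 * w  : weights, updated in place (host-side copy assumed)
 * dw : gradients filled by the backward pass, cleared afterwards
 * n  : number of float parameters
 * lr : learning rate */
void sgd_step(float *w, float *dw, size_t n, float lr)
{
    assert(w != NULL && dw != NULL);

    for (size_t i = 0; i < n; ++i)
        w[i] -= lr * dw[i];

    /* The backward-weights call accumulates into dw, so reset the buffer
     * to keep stale gradients out of the next iteration. */
    memset(dw, 0, n * sizeof(float));
}
```

On the GPU the buffers live in device memory, so the same step would need a small kernel or something like cublasSaxpy with alpha = -lr, plus cudaMemset on dw, rather than this host loop.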
I tried several weight updates and none of them works. w and dw remain constant over epochs and, even weirder, dw remains 0 even after cudnnRNNBackwardWeights.
Without a clear description of each function, we are groping in the dark.