Hello,
I have a question about using the RNN functions of the cuDNN API.
I want to train a recurrent neural network whose output does not occur at every time step: a many-to-one RNN, with only one label per sequence.
After the execution of cudnnRNNForwardTraining,
cudnnStatus_t cudnnRNNForwardTraining(
cudnnHandle_t handle,
const cudnnRNNDescriptor_t rnnDesc,
const int seqLength,
const cudnnTensorDescriptor_t *xDesc,
const void *x,
const cudnnTensorDescriptor_t hxDesc,
const void *hx,
const cudnnTensorDescriptor_t cxDesc,
const void *cx,
const cudnnFilterDescriptor_t wDesc,
const void *w,
const cudnnTensorDescriptor_t *yDesc,
void *y, // output
const cudnnTensorDescriptor_t hyDesc,
void *hy,
const cudnnTensorDescriptor_t cyDesc,
void *cy,
void *workspace,
size_t workSpaceSizeInBytes,
void *reserveSpace,
size_t reserveSpaceSizeInBytes);
I get the data pointer to the GPU memory associated with the output tensor descriptors yDesc. There is one output per iteration (time step), but only one label can be used:
y = [seqLength, batchSize, hiddenSize]; // y dimensions
Label = [batchSize, LabelVectorDimensions];
If I use only the last iteration of y to continue the forward training, everything goes well. But for backward propagation,
cudnnStatus_t cudnnRNNBackwardData(
cudnnHandle_t handle,
const cudnnRNNDescriptor_t rnnDesc,
const int seqLength,
const cudnnTensorDescriptor_t *yDesc,
const void *y, // input
const cudnnTensorDescriptor_t *dyDesc,
const void *dy, // input
const cudnnTensorDescriptor_t dhyDesc,
const void *dhy,
const cudnnTensorDescriptor_t dcyDesc,
const void *dcy,
const cudnnFilterDescriptor_t wDesc,
const void *w,
const cudnnTensorDescriptor_t hxDesc,
const void *hx,
const cudnnTensorDescriptor_t cxDesc,
const void *cx,
const cudnnTensorDescriptor_t *dxDesc,
void *dx,
const cudnnTensorDescriptor_t dhxDesc,
void *dhx,
const cudnnTensorDescriptor_t dcxDesc,
void *dcx,
void *workspace,
size_t workSpaceSizeInBytes,
const void *reserveSpace,
size_t reserveSpaceSizeInBytes);
y and dy, as inputs to cudnnRNNBackwardData, should have the same dimensions. According to the NVIDIA docs, y is the data pointer calculated by cudnnRNNForwardTraining. But in my program the dimensions of y and dy are:
y = [seqLength, batchSize, hiddenSize];
dy = [batchSize, hiddenSize, 1]
In order to make y and dy have the same dimensions, what should I do?
Thanks in advance!