[cuDNN] Backpropagation & diffData

I am somewhat confused about the description of diffData for cudnnConvolutionBackwardData:

“Data pointer to GPU memory associated with the input differential tensor descriptor diffDesc.”

As I understand it, I feed the gradData produced by cudnnConvolutionBackwardData or cudnnActivationBackward in the layer above into the current layer as its diffData (the pointer described by diffDesc). In this manner errors are propagated down, and for each layer the weight derivatives (filter & bias) can be computed.
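To make my understanding concrete, here is a minimal host-side sketch of that chaining. Layer, backward and backprop are placeholders of mine, not cuDNN API; the actual cuDNN calls would live inside backward():

```cpp
#include <vector>

// Sketch only: each layer consumes diffData = dE/d(output) and produces
// gradData = dE/d(input). The cuDNN calls (cudnnActivationBackward,
// cudnnConvolutionBackwardFilter/Data/Bias) would go inside backward().
struct Layer {
    float *gradData;                          // device buffer: dE/d(input)
    virtual void backward(const float *diffData) = 0;
    virtual ~Layer() {}
};

void backprop(std::vector<Layer*> &net, const float *costDelta) {
    const float *diffData = costDelta;        // delta for the top-most layer
    for (int i = (int)net.size() - 1; i >= 0; --i) {
        net[i]->backward(diffData);           // fills net[i]->gradData
        diffData = net[i]->gradData;          // becomes diffData one layer down
    }
}
```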

However, what should diffData be for the top-most layer? Currently I am simply using the difference between the labels and the forward-propagated output values. Is this the correct thing to do?

Also, cudnnConvolutionBackwardData only makes use of the filter data. However, I am also using a bias term. Should this bias term not be included somehow before back-propagating further?

Anyone?

This is usually called the delta (or difference) function for your ‘cost function’. It is up to you to decide what cost function to use, and it typically depends on your application. Common examples of cost functions are mean-squared error, logistic error, softmax error, etc. It can also be a significantly more complex and application-specific function such as edit distance.

See http://ufldl.stanford.edu/tutorial/supervised/MultiLayerNeuralNetworks/ for an example of how to compute the delta (difference) function for a simple mean-squared-error cost function. It’s up to you to generalize the idea if you want to use a different cost function.
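As a concrete illustration, here is a sketch of that delta for mean-squared error with a linear (identity) output layer, where the delta is simply output minus label. Buffer and kernel names are made up for the example; for a non-linear output layer you would also multiply by f'(z), as in the tutorial:

```cpp
#include <cuda_runtime.h>

// dE/d(output) for E = 0.5 * sum((output - label)^2) with a linear
// output layer: delta = output - label, element-wise.
__global__ void mseDelta(const float *output, const float *label,
                         float *delta, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        delta[i] = output[i] - label[i];
}

// Launch over the output tensor, then pass `delta` as the diffData of the
// top-most backward call, e.g.:
// mseDelta<<<(n + 255) / 256, 256>>>(d_output, d_label, d_delta, n);
```

Note the sign convention: with this definition you subtract the resulting gradients in the update step; using label minus output, as in the question, just flips the sign.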