cuDNN dropout backward mistakenly scaling up

I’m building a framework on top of cuDNN, and in my testing I noticed that the cuDNN dropout backward pass scales up the backpropagated errors by a factor that depends on the dropout rate. I believe this is a mistake in the implementation and that it may cause problems with network learning, since the error signal is scaled up each time a tensor backpropagates through a dropout layer during training. This scaling up in the backward pass disagrees with the documentation.

(BTW, I understand that the signal is intended to be scaled up during the forward pass so that signal levels remain similar during training and inference; I think that scaling during the backward pass is an issue.)
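
To illustrate the forward-pass scaling mentioned above, here is a minimal sketch in plain Python. It is not cuDNN code: `dropout_forward` and its parameters are hypothetical names, and it assumes the usual "inverted dropout" convention where each element is zeroed with probability p and the survivors are multiplied by 1/(1-p).

```python
import random

def dropout_forward(x, p, rng):
    """Hypothetical helper: zero each element with probability p and
    scale the survivors by 1/(1-p) (assumed inverted-dropout convention)."""
    scale = 1.0 / (1.0 - p)
    # The mask stores 0.0 for dropped elements and the scale for kept ones.
    mask = [0.0 if rng.random() < p else scale for _ in x]
    y = [xi * mi for xi, mi in zip(x, mask)]
    return y, mask

rng = random.Random(0)
x = [1.0] * 100_000
y, mask = dropout_forward(x, p=0.5, rng=rng)

# With the 1/(1-p) scaling, the mean of the output stays close to the
# mean of the input, so training and inference see similar signal levels.
mean_y = sum(y) / len(y)
print(mean_y)
```

Without the scaling, the mean of `y` would instead be close to 1-p times the mean of `x`.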


I’ve created a small testcase that shows the behavior:

To run:
tar -xvf dropoutBackpropTestcase.tar
cd dropoutBackpropTestcase

You may need to adjust the CUDA and cuDNN paths in the makefile.

I am using a Titan RTX, Ubuntu 18.04, CUDA 10.2, and the cuDNN build corresponding to CUDA 10.2.

After thinking about this further, I believe that the behavior of cuDNN dropout backward is correct. My confusion came from how I read the documentation. Sorry for the confusion.
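
For anyone who lands here with the same question, one way to see why the backward scaling is consistent: if the forward pass computes y = x * mask, where each mask entry is either 0 or 1/(1-p), then the chain rule gives dx = dy * mask, so the gradient of every kept element picks up the same 1/(1-p) factor. The sketch below checks this against finite differences. It is plain Python under that assumption, not the cuDNN implementation, and all function names are hypothetical.

```python
import random

def dropout_forward(x, mask):
    # y_i = x_i * mask_i, where mask_i is 0 or 1/(1-p)
    return [xi * mi for xi, mi in zip(x, mask)]

def dropout_backward(dy, mask):
    # Chain rule: dL/dx_i = dL/dy_i * dy_i/dx_i = dL/dy_i * mask_i
    return [di * mi for di, mi in zip(dy, mask)]

def loss(x, mask):
    # Arbitrary scalar loss on the dropout output: 0.5 * sum(y^2)
    return 0.5 * sum(yi * yi for yi in dropout_forward(x, mask))

p = 0.25
scale = 1.0 / (1.0 - p)
rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(8)]
mask = [0.0 if rng.random() < p else scale for _ in x]

# Analytic gradient: for this loss dL/dy = y, then apply the backward pass.
dy = dropout_forward(x, mask)
analytic = dropout_backward(dy, mask)

# Numerical gradient of the same loss by central differences.
eps = 1e-6
numeric = []
for i in range(len(x)):
    xp = list(x)
    xm = list(x)
    xp[i] += eps
    xm[i] -= eps
    numeric.append((loss(xp, mask) - loss(xm, mask)) / (2.0 * eps))

max_err = max(abs(a - n) for a, n in zip(analytic, numeric))
print(max_err)
```

The two gradients agree only when the backward pass applies the same scaled mask as the forward pass, which matches the behavior I originally reported as a bug.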