I tried changing exponentialAverageFactor while tuning my learning algorithm, and found that it has no effect at all. After some debugging, I found that:
- exponentialAverageFactor is correctly taken into account when updating runningMean and runningVariance
- but normalization is not done with runningMean and runningVariance: it is done with newMean and newVariance!
As a consequence, exponentialAverageFactor, runningMean, and runningVariance have no effect at all on the output of batch normalization.
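To make the report concrete, here is a minimal sketch in plain Python of the behavior I observed. This is not cuDNN code; it just models what cudnnBatchNormalizationForwardTraining appears to do, with names (new_mean, new_var) mirroring my description above:

```python
# Sketch (NOT actual cuDNN source) of the observed training-mode behavior:
# exponentialAverageFactor only updates the running statistics, while the
# output is normalized with the fresh batch statistics.

def batchnorm_forward_training(x, running_mean, running_var, factor, eps=1e-5):
    n = len(x)
    new_mean = sum(x) / n
    new_var = sum((v - new_mean) ** 2 for v in x) / n

    # Running stats ARE updated with exponentialAverageFactor...
    running_mean = (1 - factor) * running_mean + factor * new_mean
    running_var = (1 - factor) * running_var + factor * new_var

    # ...but normalization uses new_mean / new_var, so the output does not
    # depend on factor, running_mean, or running_var at all.
    y = [(v - new_mean) / (new_var + eps) ** 0.5 for v in x]
    return y, running_mean, running_var

x = [1.0, 2.0, 3.0, 4.0]
y1, _, _ = batchnorm_forward_training(x, 0.0, 1.0, factor=0.1)
y2, _, _ = batchnorm_forward_training(x, 0.0, 1.0, factor=0.9)
assert y1 == y2  # identical outputs regardless of the factor
```

Changing factor only changes the returned running statistics, never y, which matches what I see from cuDNN.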
That looks like a severe bug in cuDNN. Can anybody confirm?