cuDNN Batch Normalization - Can I separate operations?

t3l · May 23, 2016, 4:20pm

I am trying to get the cuDNN batch normalization to play along nicely with our framework, which already features a CPU implementation.

In particular, I am currently working on gathering the data needed to satisfy the interface's requirements. Your interface design makes this harder than it has to be.

BN consists of 3 easily separable operations that have nothing to do with each other.

1. Compute mean & variance per channel and normalize.
2. Multiply by gamma.
3. Add bias.
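
For illustration, the three steps can be sketched as independent CPU functions. This is a minimal pure-Python sketch, not cuDNN code: the function names (`normalize`, `scale`, `shift`, `batch_norm`), the channel-major list layout, and the epsilon value are all assumptions made for this example.

```python
import math

EPS = 1e-5  # assumed epsilon; real frameworks make this configurable


def normalize(channels, eps=EPS):
    """Step 1: per-channel normalization to zero mean and (near) unit variance.

    `channels` is a list with one entry per channel; each entry is a flat
    list holding all N*H*W values of that channel (hypothetical layout).
    """
    out = []
    for values in channels:
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        inv_std = 1.0 / math.sqrt(var + eps)
        out.append([(v - mean) * inv_std for v in values])
    return out


def scale(channels, gamma):
    """Step 2: multiply every value of channel c by gamma[c]."""
    return [[g * v for v in values] for values, g in zip(channels, gamma)]


def shift(channels, beta):
    """Step 3: add beta[c] to every value of channel c."""
    return [[v + b for v in values] for values, b in zip(channels, beta)]


def batch_norm(channels, gamma, beta, eps=EPS):
    """All three steps chained, mirroring a fused forward call."""
    return shift(scale(normalize(channels, eps), gamma), beta)


# Usage: two channels with four values each.
x = [[1.0, 2.0, 3.0, 4.0], [10.0, 10.0, 30.0, 30.0]]
only_step_1 = normalize(x)
all_steps = batch_norm(x, gamma=[2.0, 3.0], beta=[0.5, -0.5])
```

In this sketch, steps 2 and 3 never look at the batch statistics, which is why they can be split off from step 1.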

To me, it is rather limiting that I am forced to execute all three operations during FPROP and BPROP. This is even more confusing because it breaks with established practice: you did not hardwire Convolution and AddTensor (add bias) together either. Steps 2+3 could easily be realized manually using OpTensor. So why push everything into the same function call? Sometimes I just want to normalize, without multiplying by gamma and adding bias. Is there any way to perform FPROP and BPROP separately for step 1 only?

njuffa · May 23, 2016, 6:03pm

This is outside my area of expertise, but an educated guess is that multiplying by gamma and adding bias are essentially free in terms of performance in the context of step (1), so it may make sense to simply roll these operations in by default and let programmers select a gamma of 1.0 and a bias of 0.0 if this part of the functionality is not needed.

t3l · May 24, 2016, 1:53am

Considering my experience with ConvolutionBackwardBias, I beg to differ. That one can take quite long in some cases, and all it does is broadcast and add. However, your idea will definitely work. But it also implies that I would have to allocate dummy memory for the gamma and bias gradients during the backward phase. It would be nice if that could somehow be avoided.
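
To make the backward-phase concern concrete, here is a minimal pure-Python sketch of the textbook batch normalization gradient for a single channel. It is not the cuDNN implementation; the function name `bn_backward`, the flat-list layout, and the epsilon default are assumptions for this example. With gamma fixed at 1.0, `dx` is the gradient of step 1 alone, while `dgamma` and `dbeta` are the by-products that a fused backward call also writes out.

```python
import math


def bn_backward(x, dy, gamma, eps=1e-5):
    """Backward pass for one channel of y = gamma * x_hat + beta.

    x  : flat list of the channel's inputs
    dy : flat list of upstream gradients dL/dy, same length as x
    Returns (dx, dgamma, dbeta).
    """
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    inv_std = 1.0 / math.sqrt(var + eps)
    x_hat = [(v - mean) * inv_std for v in x]

    # Gradients of steps 3 and 2 (bias and scale).
    dbeta = sum(dy)
    dgamma = sum(g * h for g, h in zip(dy, x_hat))

    # Gradient of step 1 (normalization), scaled by gamma.
    dx = [
        gamma * inv_std * (g - dbeta / n - h * dgamma / n)
        for g, h in zip(dy, x_hat)
    ]
    return dx, dgamma, dbeta


# Usage: gamma = 1.0 gives the gradient of "normalize only";
# dgamma and dbeta are still computed and need somewhere to go.
dx, dgamma, dbeta = bn_backward(
    x=[0.5, -1.0, 2.0, 3.5], dy=[0.1, -0.2, 0.3, 0.4], gamma=1.0
)
```

Note that the two reductions behind `dgamma` and `dbeta` are also needed to form `dx`, so in this formulation they are computed anyway; only the storage for the results is extra.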

njuffa · May 24, 2016, 2:15am

As I stated, I offered only an “educated guess”. I think the folks familiar with cuDNN check this forum occasionally, so there is a chance you will get an authoritative reply regarding this particular design question.
