Hi,
I am writing a CNN with cuDNN and use "cudnnBatchNormalizationForwardInference". I could not get it to work correctly, and I did not find any code samples or references online. I also looked through the cuDNN code samples, but there is nothing about BatchNorm there either.
Is there any tutorial or code sample about batch norm?
Thanks
It seems this part of the documentation (https://docs.nvidia.com/deeplearning/sdk/cudnn-developer-guide/index.html#cudnnBatchNormalizationForwardInference) is a little confusing.
It says:
Note: The input transformation performed by this function is defined as: y := alpha*y + beta *(bnScale * (x-estimatedMean)/sqrt(epsilon + estimatedVariance)+bnBias)
But it also says:
alpha, beta
Inputs. Pointers to scaling factors (in host memory) used to blend the layer output value with prior value in the destination tensor as follows: dstValue = alpha[0]*resultValue + beta[0]*priorDstValue.
It seems these two descriptions of alpha and beta contradict each other?
In addition, how should I choose the format for "xDesc", "yDesc", and "bnScaleBiasMeanVarDesc"?
My setup is:
checkCUDNN(cudnnCreateTensorDescriptor(&bn_descriptor));
checkCUDNN(cudnnSetTensor4dDescriptor(bn_descriptor,
                                      /*format=*/CUDNN_TENSOR_NCHW,
                                      /*dataType=*/CUDNN_DATA_FLOAT,
                                      /*batch_size=*/batch_size,
                                      /*channels=*/out_channels,
                                      /*image_height=*/1,
                                      /*image_width=*/1));
checkCUDNN(cudnnCreateTensorDescriptor(&output_descriptor));
checkCUDNN(cudnnSetTensor4dDescriptor(output_descriptor,
                                      /*format=*/CUDNN_TENSOR_NHWC,
                                      /*dataType=*/CUDNN_DATA_FLOAT,
                                      /*batch_size=*/batch_size,
                                      /*channels=*/out_channels,
                                      /*image_height=*/height,
                                      /*image_width=*/width));
checkCUDNN(cudnnBatchNormalizationForwardInference(cudnn,
                                      CUDNN_BATCHNORM_SPATIAL,
                                      /*alpha=*/&alpha,
                                      /*beta=*/&beta,
                                      output_descriptor,
                                      d_convout,
                                      output_descriptor,
                                      d_convout,
                                      /*bnScaleBiasMeanVarDesc=*/bn_descriptor,
                                      d_bnScale,
                                      d_bnBias,
                                      d_estimatedMean,
                                      d_estimatedVariance,
                                      /*epsilon=*/CUDNN_BN_MIN_EPSILON));
Here, output_descriptor is used for "xDesc" and "yDesc" (it is also the descriptor for the output of the preceding convolution), and bn_descriptor is used for "bnScaleBiasMeanVarDesc". Their formats are 'CUDNN_TENSOR_NHWC' and 'CUDNN_TENSOR_NCHW', respectively. Is there a problem with this?