Has anyone tried cudnnDivisiveNormalization() in fp16?
The results are very different from fp32. Is this a bug?

(I have tried cudnnSoftmaxForward(), cudnnConvolutionForward(), and cudnnLRNCrossChannelForward() in both fp16 and fp32; they all match.)

cudnnDivisiveNormalization() only works in fp16 when lrnN=1.

cudnnStatus_t CUDNNWINAPI cudnnSetLRNDescriptor(
    cudnnLRNDescriptor_t normDesc,
    unsigned             lrnN,
    double               lrnAlpha,
    double               lrnBeta,
    double               lrnK);

It turns out that fp16 easily overflows in the within-channel normalization (LRN) layer: the denominator accumulates a sum of squared activations over the lrnN window, and that sum can exceed the fp16 maximum (~65504) even when each individual squared term still fits. With lrnN=1 there is only one term, which is why that case works.
Hence it is not a bug.