Hi,
I’m using cudnnReduceTensor to compute the channel-wise average of the input. To do that, I set alpha to 1/n and the reduce tensor descriptor’s operation to ADD. Both my input and output tensor descriptors are set to NHWC format, and everything is in float. However, cudnnReduceTensor still seems to read the input array as if it were laid out in NCHW. I’m using cuDNN (v8201). Is this a bug, or am I missing something?
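
To make the difference concrete, here is a minimal CPU sketch in C++ of the two results I am comparing. It is not cuDNN code: the function names `channel_mean_nhwc` and `channel_mean_nchw` are hypothetical, and I am assuming that n in alpha = 1/n is the number of elements reduced per channel (N*H*W). The first function is what I expect for an NHWC buffer; the second reads the same buffer with NCHW indexing, which is what the output I get appears to match.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Channel-wise mean of a buffer laid out as NHWC.
// Element (n, h, w, c) is at offset ((n*H + h)*W + w)*C + c.
std::vector<float> channel_mean_nhwc(const std::vector<float>& x,
                                     int N, int H, int W, int C) {
    assert(x.size() == static_cast<std::size_t>(N) * H * W * C);
    std::vector<float> mean(C, 0.0f);
    // Assumed scaling: alpha = 1/n with n = N*H*W elements per channel.
    const float alpha = 1.0f / (static_cast<float>(N) * H * W);
    for (int n = 0; n < N; ++n)
        for (int h = 0; h < H; ++h)
            for (int w = 0; w < W; ++w)
                for (int c = 0; c < C; ++c)
                    mean[c] += alpha * x[((n * H + h) * W + w) * C + c];
    return mean;
}

// Channel-wise mean of the SAME buffer, but indexed as if it were NCHW.
// Element (n, c, h, w) is at offset ((n*C + c)*H + h)*W + w.
std::vector<float> channel_mean_nchw(const std::vector<float>& x,
                                     int N, int H, int W, int C) {
    assert(x.size() == static_cast<std::size_t>(N) * H * W * C);
    std::vector<float> mean(C, 0.0f);
    const float alpha = 1.0f / (static_cast<float>(N) * H * W);
    for (int n = 0; n < N; ++n)
        for (int c = 0; c < C; ++c)
            for (int h = 0; h < H; ++h)
                for (int w = 0; w < W; ++w)
                    mean[c] += alpha * x[((n * C + c) * H + h) * W + w];
    return mean;
}
```

For example, with N=1, H=1, W=2, C=2 and the buffer {1, 10, 3, 30}, the NHWC reading gives channel means {2, 20}, while the NCHW reading gives {5.5, 16.5}. Comparing the cudnnReduceTensor output against both on a small input like this shows which layout is actually being used.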