[cuDNN 8.1 conv2d] If we set alpha to 1.0 and the output type to INT32, are there type conversions under the hood?

With the backend API, we can set the input type to INT8 and the output type to INT32.

In this case, does cuDNN still cast the convolution’s output to FLOAT, multiply it by alpha (which is 1.0), and then cast the result to INT32?
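
To make the concern concrete, here is a minimal sketch in plain Python (not cuDNN code) of what such a round trip would do if it happened. The helper names `to_float32` and `scale_via_float` are hypothetical, and the INT32 -> FLOAT -> multiply -> INT32 sequence is only the assumption being asked about, not confirmed cuDNN behaviour. It relies on the fact that single-precision floats cannot represent every integer above 2^24 exactly.

```python
import struct


def to_float32(x):
    """Round-trip a number through IEEE 754 single precision."""
    return struct.unpack("f", struct.pack("f", x))[0]


def scale_via_float(acc, alpha=1.0):
    """Hypothetical epilogue: INT32 accumulator -> float32 -> * alpha -> INT32."""
    scaled = to_float32(to_float32(acc) * alpha)
    return int(scaled)


# Small accumulators survive unchanged; 2**24 + 1 does not.
for acc in (1000, 2**24, 2**24 + 1):
    print(acc, "->", scale_via_float(acc))
```

If the conversion takes place, an accumulator of 2**24 + 1 would come back as 2**24 even with alpha equal to 1.0, which is why it matters whether the cast is skipped.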

Another question, about this diagram:

How is the “Clamp” done? If I’m not mistaken, the user doesn’t provide any information about this “Clamp”, so what should I expect for the “data output y”?

Hi @ocwins.z,
cuDNN doesn’t support the INT32 output type. See Table 17 in API Reference :: NVIDIA Deep Learning cuDNN Documentation for all the supported configurations.

The clamp saturates to the range of the type being cast to, and the rounding is round-to-nearest-integer (ties go to the nearest even integer).
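
As a rough illustration of that description, here is a minimal sketch in plain Python (not cuDNN code). The function name `saturating_cast` and the INT8 bounds used as defaults are assumptions made for the example. Under this reading, values already inside the target range, including 0.0, are only rounded and are not shifted or rescaled.

```python
def saturating_cast(values, lo=-128, hi=127):
    """Clamp each float to [lo, hi], then round to nearest integer (ties to even)."""
    out = []
    for v in values:
        clamped = max(lo, min(hi, v))  # saturate to the target type's range
        out.append(round(clamped))     # Python's round() rounds halves to even
    return out


print(saturating_cast([-300.0, -0.5, 0.0, 0.5, 1.5, 2.5, 126.7, 1000.0]))
# → [-128, 0, 0, 0, 2, 2, 127, 127]
```

Clamping before rounding gives the same result as rounding first for finite inputs, and it also keeps infinities from raising an error in Python.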



But I played around with your example (conv_sample.cpp from cudnn-frontend), set the output to INT32, and it passes.

I marked it as C here. It is “AFTERCONV_TENSOR” in the source code, the output of conv_op.

By clamping, do you mean mapping (min, max) to (-128, 127), or mapping floats with saturation (for example, the range of INT16) to (-128, 127), or something else?

If the activation is RELU, the minimum values are zeros, so do all zeros go to -128?

Thank you.

Hi @ocwins.z,
Apologies for the delay. Are you still facing the issue?