Hi everyone,
I have been wondering: is it possible to implement 8-bit quantization in cuDNN without using TensorRT? I could not find any useful information in the documentation or in the code I have seen on GitHub. Is it true that cuDNN offers no way to do 8-bit quantization without TensorRT?
Thank you so much!