Hardware support for INT8 precision

Hi,

I came across multiple threads where it was mentioned that the TX2 does not have hardware support for 8-bit computation. What does this mean from a hardware point of view? What is missing in the TX2 that, if it were available, would allow 8-bit computation?

I was under the impression that if we have a 32- or 64-bit FPU, we could pack multiple 8-bit values together for computation. This should be possible from the software side. To me, 8-bit computation seems to be a software hack rather than a hardware feature. What am I missing here?
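To make the packing idea concrete, here is a rough sketch of what I mean. The `__dp4a` intrinsic is the packed INT8 dot-product that the CUDA docs list for compute capability 6.1 and higher; the fallback branch is my guess at what the "software hack" version would look like. The kernel and variable names here are just for illustration:

```cuda
#include <cstdio>
#include <cstdint>

// Dot product of four int8 lanes packed into 32-bit operands.
__global__ void dp4a_demo(int a, int b, int *out)
{
#if __CUDA_ARCH__ >= 610
    // Hardware support: one instruction does all four multiply-accumulates.
    *out = __dp4a(a, b, 0);
#else
    // Software packing: each lane must be unpacked, multiplied, and
    // accumulated separately, so there is no throughput win.
    int acc = 0;
    for (int i = 0; i < 4; ++i) {
        int8_t ai = (int8_t)(a >> (8 * i));
        int8_t bi = (int8_t)(b >> (8 * i));
        acc += (int)ai * (int)bi;
    }
    *out = acc;
#endif
}

int main()
{
    // Pack four int8 values per operand: a = {1,2,3,4}, b = {5,6,7,8}.
    int a = 1 | (2 << 8) | (3 << 16) | (4 << 24);
    int b = 5 | (6 << 8) | (7 << 16) | (8 << 24);

    int *d_out, h_out;
    cudaMalloc(&d_out, sizeof(int));
    dp4a_demo<<<1, 1>>>(a, b, d_out);
    cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    printf("dot = %d (expect 1*5 + 2*6 + 3*7 + 4*8 = 70)\n", h_out);
    cudaFree(d_out);
    return 0;
}
```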

I would like to do inference of neural network models on the TX2, and INT8 could potentially give a decent speedup. I am trying to understand why the TX2 does not support INT8.

Thanks in advance!

Best regards,
Ragavendra

You may get some information from the links below.

https://devtalk.nvidia.com/default/topic/1026069
https://devtalk.nvidia.com/default/topic/1032623

Hi,

I have already seen these links.
What does "no hardware support" mean here? What is missing or unavailable on the sm_62 GPU architecture that prevents it from doing INT8 computation?

Best regards,
Ragavendra


Hi Ragavendra, sm_62 doesn't have Tensor Cores in hardware, which are what accelerate DNNs with INT8. Note that you can still program integer operations in CUDA yourself, but on sm_62 these are most often used for array indexing and similar tasks rather than heavy math, since its ratio of integer ALUs is lower than that of Xavier (sm_72, the first Jetson to support Tensor Cores). So on sm_62, it is recommended to use FP16 precision for best inferencing performance.
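If you want to check this at runtime, TensorRT's builder can report which precisions the platform accelerates. A minimal sketch against the TensorRT 4/5-era C++ API (platformHasFastFp16 / platformHasFastInt8 / setFp16Mode; these calls were deprecated in later releases in favor of builder-config flags):

```cpp
#include <NvInfer.h>
#include <iostream>

// Minimal logger required by the TensorRT builder.
class Logger : public nvinfer1::ILogger
{
    void log(Severity severity, const char *msg) override
    {
        if (severity <= Severity::kWARNING)
            std::cout << msg << std::endl;
    }
};

int main()
{
    Logger logger;
    nvinfer1::IBuilder *builder = nvinfer1::createInferBuilder(logger);

    // On a TX2 (sm_62), expect fast FP16 = 1 but fast INT8 = 0.
    std::cout << "Fast FP16: " << builder->platformHasFastFp16() << std::endl;
    std::cout << "Fast INT8: " << builder->platformHasFastInt8() << std::endl;

    if (builder->platformHasFastFp16())
        builder->setFp16Mode(true);  // prefer FP16 kernels when building engines

    builder->destroy();
    return 0;
}
```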