I am not familiar with the Tegra TX2. You could try asking in the dedicated Tegra X2 forum “next door”. I likewise know close to nothing about GV100 other than that it exists. GV100 is a supercomputer class part, not anything I am likely to use any time soon.
At this time, INT8 support is available on devices of compute capability 6.1 and 7.0:
GV100 = cc7.0 (INT8 is supported)
TX2 = cc6.2 (INT8 not supported)
A simple Google search for “INT8 TX2” turns up the TX2 information readily.
Note that when using code optimized for TensorCore (on GV100) the FP16 throughput (peak, theoretical) is higher than the INT8 throughput (peak, theoretical).
INT8 throughput (peak, theoretical) on GV100 is ~4x the FP32 throughput (4 x 15 = 60 TOPS), whereas the peak theoretical FP16 multiply-accumulate throughput for matrix-multiply ops on TensorCore is 120 TFLOPS.