I would like to use int8 SIMD instructions (e.g. VABSDIFF4.U8) when programming for CC9.0 (H100).
But I don’t know how to find the SIMD instructions available on CC9.0.
Can you please tell me which int8 SIMD instructions are available on CC9.0 or where I can find a list of them?
Here you can find a list for Hopper (compute capability 9.0):
VABSDIFF4 is included.
Other supported int8 instructions are I2F (Integer-To-Float Conversion) and IDP/IDP4A (Integer Dot Product).
And of course the Tensor Cores.
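As a rough illustration of how the byte-wise instructions mentioned above can be reached from CUDA C++, here is a minimal sketch using the `__vabsdiffu4` and `__dp4a` intrinsics. The choice of these intrinsics is my assumption about what fits the question; the kernel and the `run_simd_demo` wrapper are hypothetical names made up for this example. Whether the compiler emits a single VABSDIFF4 or IDP4A instruction for them on a given architecture is not guaranteed, so it is worth inspecting the generated SASS, e.g. build with `nvcc -arch=sm_90 demo.cu -o demo` and then run `cuobjdump -sass demo`.

```cuda
#include <cuda_runtime.h>

// One thread applies two byte-wise SIMD intrinsics to packed 32-bit words.
__global__ void simd_demo_kernel(unsigned a, unsigned b, int acc,
                                 unsigned *absdiff, int *dot)
{
    // Per-byte unsigned |a_i - b_i| for the four bytes packed in a and b.
    *absdiff = __vabsdiffu4(a, b);
    // Signed 8-bit dot product of the four byte pairs, added to acc.
    // __dp4a requires compute capability 6.1 or higher.
    *dot = __dp4a(static_cast<int>(a), static_cast<int>(b), acc);
}

// Hypothetical host wrapper (name chosen for this sketch only):
// launches the kernel once and copies both results back to the host.
cudaError_t run_simd_demo(unsigned a, unsigned b, int acc,
                          unsigned *absdiff, int *dot)
{
    unsigned *d_absdiff = nullptr;
    int *d_dot = nullptr;

    cudaError_t err = cudaMalloc(&d_absdiff, sizeof(unsigned));
    if (err != cudaSuccess) return err;
    err = cudaMalloc(&d_dot, sizeof(int));
    if (err != cudaSuccess) { cudaFree(d_absdiff); return err; }

    simd_demo_kernel<<<1, 1>>>(a, b, acc, d_absdiff, d_dot);
    err = cudaGetLastError();
    if (err == cudaSuccess)
        err = cudaMemcpy(absdiff, d_absdiff, sizeof(unsigned),
                         cudaMemcpyDeviceToHost);
    if (err == cudaSuccess)
        err = cudaMemcpy(dot, d_dot, sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(d_absdiff);
    cudaFree(d_dot);
    return err;
}
```

Calling `run_simd_demo(0x10203040u, 0x40302010u, 0, &ad, &dp)` should give `ad == 0x30101030` (the per-byte absolute differences 0x30, 0x10, 0x10, 0x30) and `dp == 5120` (64*16 + 48*32 + 32*48 + 16*64).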
You can also check the PTX manual (https://docs.nvidia.com/cuda/parallel-thread-execution/index.html) to see whether it mentions a performance degradation on certain sm_xx targets for a specific PTX instruction.