Which PTX instruction will lower to the VIADDMNMX instruction?

Hi, I am on the Hopper architecture and use the intrinsic “__viaddmax_s32”, but both the PTX and the SASS it lowers to consist of two instructions.
So my question is: how can I make use of VIADDMNMX (SIMD Integer Addition and Fused Min/Max Comparison)?

The CUDA Binary Utilities documentation confirms that there is a VIADDMNMX instruction provided by Hopper.

I cannot find any corresponding PTX instruction in the latest PTX manual (Parallel Thread Execution ISA Version 8.1). Whether this functionality is intended to be exposed at the PTX level, only NVIDIA can say. Given the trouble with exposing previous SIMD instructions at the PTX level (most of which later had to be emulated), it seems plausible that the task of generating VIADDMNMX has been relegated to a peephole optimization performed by ptxas.

(1) Are you using the latest CUDA tool chain (CUDA Toolkit 12.1 Update 1)?
(2) Is the code being compiled with full ptxas optimizations (-Xptxas -O3 is the nvcc default, but it may be overridden by a programmer-provided command-line flag)?
(3) Is the code being compiled for the correct architecture (sm_90)?

If the answer to all three of the above is “yes”, you may want to file a bug report / feature request with NVIDIA.
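
To check the three points above, something like the following minimal test file may help. It is only a sketch: the file name viaddmax.cu, the kernel name, and the test data are made up for illustration. The intrinsic __viaddmax_s32(a, b, c) computes max(a + b, c); whether ptxas emits VIADDMNMX for it on sm_90 is exactly what the SASS dump is meant to show, so no particular output is promised here.

```cuda
// viaddmax.cu: hypothetical file name, used only in the commands below.
//
// Inspect the generated code (standard nvcc / cuobjdump options):
//   nvcc -arch=sm_90 -ptx   -o viaddmax.ptx   viaddmax.cu   # PTX
//   nvcc -arch=sm_90 -cubin -o viaddmax.cubin viaddmax.cu   # ptxas at its default optimization level
//   cuobjdump -sass viaddmax.cubin                          # look for VIADDMNMX in this dump
//
// Build and run as an ordinary program (needs a GPU matching -arch):
//   nvcc -arch=sm_90 -o viaddmax viaddmax.cu && ./viaddmax
#include <cstdio>
#include <cuda_runtime.h>

// Host reference for what the intrinsic computes: max(a + b, c).
static int viaddmax_ref(int a, int b, int c) {
    int s = a + b;
    return s > c ? s : c;
}

// One intrinsic call per thread, so the SASS for it is easy to spot.
__global__ void viaddmax_kernel(const int* a, const int* b, const int* c,
                                int* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = __viaddmax_s32(a[i], b[i], c[i]);
    }
}

int main() {
    const int n = 256;
    int *a, *b, *c, *out;
    cudaMallocManaged(&a, n * sizeof(int));
    cudaMallocManaged(&b, n * sizeof(int));
    cudaMallocManaged(&c, n * sizeof(int));
    cudaMallocManaged(&out, n * sizeof(int));

    // Arbitrary small values; a + b does not overflow for these inputs.
    for (int i = 0; i < n; ++i) {
        a[i] = i - 100;
        b[i] = 2 * i;
        c[i] = 50;
    }

    viaddmax_kernel<<<1, n>>>(a, b, c, out, n);
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Compare every device result against the host reference.
    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (out[i] != viaddmax_ref(a[i], b[i], c[i])) ++mismatches;
    }
    std::printf("mismatches: %d\n", mismatches);

    cudaFree(a); cudaFree(b); cudaFree(c); cudaFree(out);
    return mismatches == 0 ? 0 : 1;
}
```

If the SASS dump still shows a separate add and max after compiling this way, that output together with the exact nvcc command line would be useful material for the bug report.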

Thank you, I will try compiling with full ptxas optimizations.