NVIDIA Profiler Output: Which PTX instructions are in each output category?

When you run nvprof, you get dynamic instruction counts in the following categories:
FP insts (single)
FP insts (double)
controlflow_insts
load/store insts
misc insts
int insts
bit-convert insts

Is it possible to know which PTX instructions are in each category?

For instance, which of the above categories does the “MOV” instruction fall into?
I would appreciate more detail, as this is going to be used for research.

Thanks

Side remark: I would assume that the nvprof instruction statistics refer to SASS (machine code) instructions. PTX merely provides a virtual instruction set, and many PTX instructions are in fact emulated at the machine code level, meaning they map to a few or even many different machine instructions.

For example, if you were to examine the machine code for a ‘long long int’ division (which is a single PTX instruction), you might find single-precision floating-point instructions, conversion instructions, integer arithmetic instructions, logical instructions, and control flow instructions.
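To see this for yourself, here is a minimal sketch of a test program. The file name div64.cu, the kernel name div64, and the input values are my own choices for illustration, and error checking is left out for brevity. Each thread performs one 'long long int' division, so the PTX should contain a single div.s64 for it, while the SASS for the same kernel should show a much longer instruction sequence.

```cuda
// div64.cu (hypothetical file name): one 64-bit signed division per thread.
//
// Build and inspect, assuming the CUDA toolkit is on your PATH:
//   nvcc -o div64 div64.cu
//   cuobjdump -ptx  div64    # virtual ISA: look for div.s64
//   cuobjdump -sass div64    # machine code: the expanded sequence
#include <cstdio>
#include <cuda_runtime.h>

__global__ void div64(const long long *a, const long long *b,
                      long long *q, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        q[i] = a[i] / b[i];   // a single PTX instruction, many SASS instructions
    }
}

int main()
{
    const int n = 256;
    const size_t bytes = n * sizeof(long long);

    // Host inputs; the divisors are chosen so that none of them is zero.
    long long ha[n], hb[n], hq[n];
    for (int i = 0; i < n; ++i) {
        ha[i] = 1000000007LL * (i + 1);
        hb[i] = i + 3;
    }

    long long *da, *db, *dq;
    cudaMalloc(&da, bytes);
    cudaMalloc(&db, bytes);
    cudaMalloc(&dq, bytes);
    cudaMemcpy(da, ha, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb, bytes, cudaMemcpyHostToDevice);

    div64<<<1, n>>>(da, db, dq, n);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(hq, dq, bytes, cudaMemcpyDeviceToHost);

    // Compare against the same division done on the host.
    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (hq[i] != ha[i] / hb[i]) {
            ++mismatches;
        }
    }
    printf("mismatches: %d\n", mismatches);

    cudaFree(da);
    cudaFree(db);
    cudaFree(dq);
    return mismatches != 0;
}
```

The exact SASS you get depends on the target architecture and the toolkit version, so treat the dump as an illustration of the PTX-to-SASS expansion rather than a fixed instruction list.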

As njuffa said, the profiler output is more closely related to SASS than to PTX.

A reasonable breakdown of the SASS instructions and the categories they belong to is given in the documentation:

[url]http://docs.nvidia.com/cuda/cuda-binary-utilities/index.html#fermi[/url]
[url]http://docs.nvidia.com/cuda/cuda-binary-utilities/index.html#kepler[/url]
[url]http://docs.nvidia.com/cuda/cuda-binary-utilities/index.html#maxwell[/url]