NVIDIA profiler output: which PTX instructions are in each output category?
When you run nvprof, you get dynamic instruction counts in the following categories:
FP insts (single)
FP insts (double)
controlflow_insts
load/store insts
misc insts
int insts
bit-convert insts
Is it possible to know which PTX instructions fall into each category?
For instance, which of the above categories does the “MOV” instruction fall into?
I would appreciate more information, as this is going to be used for research.
Side remark: I would assume that the nvprof instruction statistics refer to SASS (machine code) instructions. PTX merely provides a virtual instruction set, and many PTX instructions are in fact emulated at the machine code level, meaning they map to a few or even many different machine instructions.
For example, if you were to examine the machine code for a ‘long long int’ division (which is a single PTX instruction), you might find single-precision floating-point instructions, conversion instructions, integer arithmetic instructions, logical instructions, and control flow instructions.
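One way to check this yourself is to compile a tiny kernel that does nothing but a 64-bit division and compare its PTX with its machine code. The following is only a minimal sketch: the names div64_kernel and div64_on_gpu, and the file name div64.cu used below, are made up for illustration, and error checking is left out for brevity.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: each thread performs one signed 64-bit division.
// At the PTX level the division is a single div.s64 instruction.
__global__ void div64_kernel(const long long *a, const long long *b,
                             long long *q, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        q[i] = a[i] / b[i];
    }
}

// Hypothetical host helper: divides one pair of values on the GPU and
// returns the quotient. CUDA error checking is omitted for brevity.
long long div64_on_gpu(long long a, long long b)
{
    long long *d_a = nullptr, *d_b = nullptr, *d_q = nullptr;
    long long q = 0;

    cudaMalloc(&d_a, sizeof(long long));
    cudaMalloc(&d_b, sizeof(long long));
    cudaMalloc(&d_q, sizeof(long long));

    cudaMemcpy(d_a, &a, sizeof(long long), cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, &b, sizeof(long long), cudaMemcpyHostToDevice);

    div64_kernel<<<1, 1>>>(d_a, d_b, d_q, 1);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(&q, d_q, sizeof(long long), cudaMemcpyDeviceToHost);

    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_q);
    return q;
}
```

Assuming the code is saved as div64.cu, "nvcc -ptx div64.cu" writes the PTX, where the division should show up as one div.s64. "nvcc -cubin -arch=sm_XX div64.cu" (with XX replaced by your GPU's compute capability) followed by "cuobjdump -sass div64.cubin" dumps the machine code, where the same division expands into a longer instruction sequence. The exact sequence depends on the GPU architecture and compiler version, so treat whatever you see as specific to your setup. If the nvprof counts are indeed per SASS instruction, as I assume above, this one PTX instruction would contribute to several of the categories at once.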