Small number of thread instructions

For a kernel, I see that the number of threads is 512, while the number of instructions (smsp*.sum) seems too small. Is that normal? Does that mean some threads had no instructions to execute? Any ideas?

Yes, that’s possible; not every launched thread necessarily executes the same number of instructions. It can be helpful to collect the SourceCounters section and inspect the Instructions Executed and Predicated-On Thread Instructions Executed columns on the Source page to see precisely how often each SASS instruction is executed.
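To make the difference between the two counts concrete, here is a rough toy model in Python. It is only a sketch, not how the hardware or the profiler actually works: the function `count_instructions`, the instruction counts (8 common, 20 extra), and the "only thread 0 takes the branch" condition are all made-up assumptions for illustration.

```python
WARP_SIZE = 32  # threads per warp on NVIDIA GPUs


def count_instructions(num_threads, common_insts, extra_insts, takes_branch):
    """Toy model (hypothetical): every thread runs `common_insts` instructions;
    threads for which takes_branch(tid) is true also run `extra_insts` more.

    Returns (warp_level_count, thread_level_count).
    """
    warp_insts = 0
    thread_insts = 0
    for warp_start in range(0, num_threads, WARP_SIZE):
        tids = range(warp_start, min(warp_start + WARP_SIZE, num_threads))
        active_in_branch = sum(1 for t in tids if takes_branch(t))

        # Common code: counted once per warp, and once per thread in the warp.
        warp_insts += common_insts
        thread_insts += common_insts * len(tids)

        # Branch code: the warp executes it only if at least one thread needs it,
        # and only the active threads contribute to the thread-level count.
        if active_in_branch > 0:
            warp_insts += extra_insts
            thread_insts += extra_insts * active_in_branch
    return warp_insts, thread_insts


# Assumed example: 512 threads, only thread 0 takes the extra branch.
warp_insts, thread_insts = count_instructions(512, 8, 20, lambda tid: tid == 0)
print(warp_insts, thread_insts)
```

In this model the warp-level count is far smaller than the thread count times the instructions per thread, simply because each warp instruction is counted once.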

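smsp__inst_executed is the number of warp instructions executed, not thread instructions: an instruction executed by a warp is counted once, regardless of how many of its 32 threads are active. The results show a kernel with 1 thread block of 512 threads = 16 warps. The average warp executed 10 instructions.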
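The arithmetic behind that can be checked quickly. This is just a sketch using the numbers from this reply (1 block, 512 threads, about 10 instructions per warp); the implied total and the thread-level upper bound are derived here and are not values read from the report.

```python
WARP_SIZE = 32           # threads per warp on NVIDIA GPUs
num_blocks = 1           # from the reply above
threads_per_block = 512  # from the reply above
avg_warp_insts = 10      # average per warp, from the reply above

# Warps per block, rounded up in case the block size is not a multiple of 32.
warps_per_block = (threads_per_block + WARP_SIZE - 1) // WARP_SIZE
total_warps = num_blocks * warps_per_block

# Implied warp-level total (what a .sum over all SMSPs would roughly show).
total_warp_insts = total_warps * avg_warp_insts

# Upper bound on thread-level instructions, assuming all 32 threads
# of every warp are active for every instruction.
max_thread_insts = total_warp_insts * WARP_SIZE

print(total_warps, total_warp_insts, max_thread_insts)
```

So a sum in the low hundreds is expected for a kernel this small, even though 512 threads were launched.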
