Metrics meaning in Nsight Compute

Hi!
I want to understand some of the metrics in the Nsight Compute occupancy calculator.


I launched just one block with 12 threads (my_kernel<<<1, 12>>>). But when I look at “Active Threads per Multiprocessor”, the number of active threads is 256! What is the difference between these two terms? Is it the same as the difference between a hardware thread and a software thread?

Hi, @lzxdjb

These are different things.

The 12 threads come from your application's launch configuration. Active Threads per Multiprocessor (256) is calculated as 32 * Active Warps per Multiprocessor (8), where 32 is the number of threads in a warp.

You could have up to 256 threads active per multiprocessor, but your application did not launch that many.
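To make the arithmetic concrete, here is a minimal host-side sketch in plain C++ (no GPU needed). The helper names `warpsPerBlock` and `activeThreadsPerSM` are made up for illustration and are not part of Nsight Compute or the CUDA API. The warp size of 32 and the value of 8 active warps are taken from the numbers above.

```cpp
#include <cassert>

// Threads per warp, as used in the formula above.
constexpr int kWarpSize = 32;

// Hypothetical helper: number of warps needed to hold one block.
// A partially filled warp still counts as a whole warp.
inline int warpsPerBlock(int threadsPerBlock) {
    assert(threadsPerBlock > 0);
    return (threadsPerBlock + kWarpSize - 1) / kWarpSize;  // round up
}

// Hypothetical helper: the figure from the calculator,
// 32 * Active Warps per Multiprocessor.
inline int activeThreadsPerSM(int activeWarpsPerSM) {
    assert(activeWarpsPerSM >= 0);
    return kWarpSize * activeWarpsPerSM;
}
```

With these helpers, `warpsPerBlock(12)` gives 1, so the 12-thread block fits in a single warp, while `activeThreadsPerSM(8)` gives the 256 shown by the calculator. The first number describes what the application launched; the second is derived from the active warp count and does not depend on how many of those threads the kernel actually uses.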
