- Is there documentation on all the metrics on the “details” page?
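For most metrics, you can see the description by querying the metric with nv-nsight-cu-cli, for example through its --query-metrics option (see Nsight Compute CLI :: Nsight Compute Documentation).
- What is “waves per SM”?
See https://devblogs.nvidia.com/cuda-pro-tip-minimize-the-tail-effect/. In short, a wave is the set of thread blocks that can run concurrently on the GPU. If the last wave is only partially full, some SMs sit idle while others finish, which is the tail effect.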
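As a rough illustration of the wave arithmetic, here is a minimal Python sketch. It assumes the wave count is the grid size divided by the number of blocks the whole GPU can hold at once (SM count times resident blocks per SM). The function name and the example numbers are hypothetical, and the value Nsight Compute reports may be computed differently in detail.

```python
def waves_per_sm(num_blocks: int, num_sms: int, max_blocks_per_sm: int) -> float:
    """Estimate the number of waves needed to run a grid.

    Assumption: one full wave is num_sms * max_blocks_per_sm blocks,
    where max_blocks_per_sm is the occupancy limit for this kernel.
    """
    if num_blocks < 0 or num_sms <= 0 or max_blocks_per_sm <= 0:
        raise ValueError("block count must be >= 0 and SM limits must be > 0")
    blocks_per_wave = num_sms * max_blocks_per_sm
    return num_blocks / blocks_per_wave


def tail_blocks(num_blocks: int, num_sms: int, max_blocks_per_sm: int) -> int:
    """Number of blocks left over for a final, partially filled wave."""
    blocks_per_wave = num_sms * max_blocks_per_sm
    return num_blocks % blocks_per_wave


# Hypothetical device: 80 SMs, 2 resident blocks per SM for this kernel.
print(waves_per_sm(200, 80, 2))  # one full wave plus a partial one
print(tail_blocks(200, 80, 2))   # blocks that run in the partial wave
```

A fractional result means the last wave is partial, which is where the tail effect comes from.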
- How can I get the number of SMs used to launch a kernel in nsight compute?
Blocks are scheduled on different SMs where possible, so all SMs will be active if the number of blocks is at least the number of SMs. Otherwise, the number of active SMs equals the number of blocks. In your application, this may vary due to concurrent kernel execution.
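The rule above amounts to taking the minimum of the two counts. A minimal Python sketch follows. The function name is hypothetical, and it assumes a single kernel running alone on the device, so it is only an estimate when kernels execute concurrently.

```python
def estimated_active_sms(num_blocks: int, num_sms: int) -> int:
    """Estimate how many SMs receive at least one block.

    Assumption: the scheduler spreads blocks across distinct SMs
    before placing a second block on any SM, and no other kernel
    is running concurrently.
    """
    if num_blocks < 0 or num_sms <= 0:
        raise ValueError("num_blocks must be >= 0 and num_sms must be > 0")
    return min(num_blocks, num_sms)


# Hypothetical device with 80 SMs.
print(estimated_active_sms(40, 80))   # fewer blocks than SMs
print(estimated_active_sms(200, 80))  # enough blocks to cover every SM
```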