I want to calculate the number of active (occupied) SMs during kernel profiling with Nsight Compute. What is an accurate way to do so?
Nsight Compute does not show per-SM counter values. The average, minimum, and maximum number of active cycles across SMs can be collected with the metrics:
sm__cycles_active.avg sm__cycles_active.max sm__cycles_active.min
If sm__cycles_active.min == 0, then at least one SM was idle for the entire grid.
To express the above metrics as a percentage of the grid execution (that is, divided by sm__cycles_elapsed.avg), collect the metrics:
sm__cycles_active.avg.pct_of_peak_sustained_elapsed sm__cycles_active.max.pct_of_peak_sustained_elapsed sm__cycles_active.min.pct_of_peak_sustained_elapsed
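As a rough illustration of the conversion described above, the sketch below divides raw cycle counts by the average elapsed cycles and flags the idle-SM case. The function names and the sample numbers are hypothetical and are not Nsight Compute output; the sketch assumes the percentage is simply active cycles over sm__cycles_elapsed.avg.

```python
def active_pct(cycles_active: float, cycles_elapsed_avg: float) -> float:
    """Active cycles as a percentage of the average elapsed cycles."""
    if cycles_elapsed_avg <= 0:
        raise ValueError("cycles_elapsed_avg must be positive")
    return 100.0 * cycles_active / cycles_elapsed_avg


def summarize(active_avg: float, active_min: float, active_max: float,
              elapsed_avg: float) -> dict:
    """Convert the avg/min/max active-cycle counts into percentages."""
    return {
        "avg_pct": active_pct(active_avg, elapsed_avg),
        "min_pct": active_pct(active_min, elapsed_avg),
        "max_pct": active_pct(active_max, elapsed_avg),
        # A minimum of 0 means at least one SM was idle for the entire grid.
        "some_sm_idle": active_min == 0,
    }


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(summarize(750.0, 0.0, 1000.0, 1000.0))
```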
Thanks for the prompt response.
So, to get the average number of active SMs during grid execution, would it be:
(sm__cycles_active.avg.pct_of_peak_sustained_elapsed / 100) * total number of SMs for the given architecture (108 for A100)?
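The arithmetic being asked about can be sketched as follows. It assumes the metric is a percentage equal to sm__cycles_active.avg divided by sm__cycles_elapsed.avg, as described earlier in the thread, so it has to be divided by 100 before scaling by the SM count. The function name is hypothetical, and the result is a time-averaged estimate, not a per-SM measurement.

```python
def average_active_sms(pct_active_avg: float, num_sms: int) -> float:
    """Estimate the time-averaged number of active SMs.

    pct_active_avg: value of sm__cycles_active.avg.pct_of_peak_sustained_elapsed
                    (a percentage in the range 0..100).
    num_sms:        number of SMs on the device (108 for A100, per the question).
    """
    if not 0.0 <= pct_active_avg <= 100.0:
        raise ValueError("pct_active_avg must be between 0 and 100")
    if num_sms <= 0:
        raise ValueError("num_sms must be positive")
    # Convert the percentage to a fraction, then scale by the SM count.
    return pct_active_avg / 100.0 * num_sms


if __name__ == "__main__":
    # Hypothetical reading of 50% on a 108-SM device.
    print(average_active_sms(50.0, 108))  # 54.0
```

Note that an average of 54 does not distinguish between 54 SMs busy for the whole grid and 108 SMs busy for half of it; the min and max metrics help tell those cases apart.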