Question for sm__elapsed_cycles_sum

Hi,

As per the Perfworks API description:

The total count of the number of cycles within a range for a sm unit instance.

Does that range mean the duration of the kernel? And am I right to say that,

if I have sm__active_warps_sum and sm__elapsed_cycles_sum for a kernel,

dividing the active count by the elapsed count gives me the average number of active warps per elapsed cycle for an SM unit in this kernel?

Thanks in advance.

Yes. The range is the duration of the kernel. There are elapsed cycle counters in each clock domain. sm__active_cycles_sum will be less than sm__elapsed_cycles_sum due to launch overhead and whenever any SMs are idle (no active warps) during the capture range.

sm__active_warps_sum / sm__elapsed_cycles_sum is the average number of warps active per SM for the kernel (in the range 0-64 on Kepler through Volta).
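To make the division concrete, here is a minimal sketch in Python. The function name and the counter values are hypothetical and chosen only for illustration; they do not come from a real capture.

```python
def average_active_warps(active_warps_sum: int, elapsed_cycles_sum: int) -> float:
    """Average warps active per SM per elapsed cycle for one kernel range."""
    if elapsed_cycles_sum <= 0:
        raise ValueError("elapsed_cycles_sum must be positive")
    return active_warps_sum / elapsed_cycles_sum


# Hypothetical counter values for a single kernel (not measured data).
active_warps_sum = 1_920_000    # stands in for sm__active_warps_sum
elapsed_cycles_sum = 60_000     # stands in for sm__elapsed_cycles_sum

avg = average_active_warps(active_warps_sum, elapsed_cycles_sum)
print(f"average active warps per elapsed cycle: {avg:.1f}")  # 32.0
```

With these made-up numbers the result is 32.0, which sits inside the 0-64 range mentioned above.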

If you are using a GV100 or Turing GPU with Nsight Compute, then Nsight Compute will use a newer version of Perfworks. The metrics you listed have slightly different names there. In addition, the metrics system can automatically calculate a number of useful derived metrics.

sm__warps_active.{avg, max, min, sum}
sm__cycles_active.{avg, max, min, sum}

If you want the average active warps per active cycle or per elapsed cycle, you can collect the metrics

sm__warps_active.avg.per_cycle_active
sm__warps_active.avg.per_cycle_elapsed

If you want this normalized to 0-100%, you can use the metric names

sm__warps_active.avg.pct_of_peak_sustained_active
sm__warps_active.avg.pct_of_peak_sustained_elapsed
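As a rough illustration of what normalizing to 0-100% means, the sketch below scales an average warp count by an assumed peak of 64 warps per SM (the Kepler through Volta figure quoted earlier). The function name and the peak value are assumptions for illustration; this is not the exact formula the tool uses, and the peak differs on other architectures.

```python
def pct_of_peak(avg_warps_per_cycle: float, max_warps_per_sm: int = 64) -> float:
    """Express an average active-warp count as a percentage of an assumed peak."""
    if max_warps_per_sm <= 0:
        raise ValueError("max_warps_per_sm must be positive")
    return 100.0 * avg_warps_per_cycle / max_warps_per_sm


# Hypothetical input: 32 warps active on average, assumed peak of 64 per SM.
print(f"{pct_of_peak(32.0):.1f}%")  # 50.0%
```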

Thank you for your reply. Great information!