Why is the stall reason metric calculated based on issue-active cycles? Doesn’t that make the unit “warp/cycle”, which does not match the “cycle/inst” unit of “Warp Cycles Per Issued Instruction”?
Is this calculation related to the equality between inst_issued and issue_cycle metrics?
So I’m confused about why the X-axis of the histogram does not show cycle per instruction, while the corresponding stall reason metric is actually warp per cycle?
The metric you have circled in red is a ratio. It is calculated as the number of warps stalled per cycle in this state times the total number of cycles. Then, since we don’t issue on every cycle, we divide by the total number of cycles in which an instruction was issued. This gives us the average number of cycles a warp spends in this state per issued instruction. You could also label the x-axis “Cycles per Issued Instruction” but we use “Cycles per Instruction” for brevity.
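To make the units concrete, here is a minimal sketch of that arithmetic in Python. The function and parameter names are hypothetical and chosen for illustration only; they are not Nsight Compute metric names, and the input numbers are made up.

```python
def stall_cycles_per_issue(avg_warps_in_state_per_cycle, total_cycles, issue_active_cycles):
    """Average cycles a warp spends in one stall state per issued instruction.

    All three inputs are hypothetical values, not real profiler output.
    """
    # (warps / cycle) * cycles = total warp-cycles spent in this state
    warp_cycles_in_state = avg_warps_in_state_per_cycle * total_cycles
    # Divide by the cycles in which an instruction was issued.
    # On a single-issue scheduler this equals the number of issued instructions.
    return warp_cycles_in_state / issue_active_cycles


# Made-up example: on average 2 warps sit in this state each cycle,
# over 1000 cycles, with an instruction issued on 250 of those cycles.
print(stall_cycles_per_issue(2.0, 1000, 250))  # 8.0
```

The per-cycle term cancels when multiplied by the total cycle count, so the result reads as cycles per issued instruction rather than warps per cycle.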
I have another question that I don’t quite understand. Why are inst_issued and issue_cycle metrics the same? Isn’t this a multiple issue situation?
CC 2.x (Fermi) - CC 6.x (Pascal) supported dual-issue of instructions per SM sub-partition warp scheduler (SMSP).
CC 7.0 (Volta) - CC 9.0 (Hopper) support single-issue of instructions per SM sub-partition warp scheduler (SMSP).
For dual-issue architectures smsp__inst_issued.avg can be up to 2 x smsp__issue_active.avg. For single-issue architectures the counters will output the same value.
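As a rough illustration of the relationship between the two counters, the sketch below encodes the issue widths listed above and checks a pair of counter values against them. The helper names are hypothetical, and the counter values are made up rather than taken from a real profile.

```python
def max_issue_width(cc_major):
    """Instructions a warp scheduler can issue per cycle, per the CC ranges above."""
    if 2 <= cc_major <= 6:   # CC 2.x (Fermi) through CC 6.x (Pascal): dual-issue
        return 2
    if 7 <= cc_major <= 9:   # CC 7.0 (Volta) through CC 9.0 (Hopper): single-issue
        return 1
    raise ValueError(f"compute capability major {cc_major} is not covered here")


def issue_ratio(inst_issued, issue_active):
    """Instructions issued per cycle in which at least one instruction was issued."""
    return inst_issued / issue_active


# Made-up counter values for illustration.
assert issue_ratio(1500, 1000) <= max_issue_width(6)   # dual-issue: ratio may reach 2
assert issue_ratio(1000, 1000) == max_issue_width(8)   # single-issue: counters match
```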
Warps report stall reasons per cycle. For normalization, the tools use cycles in which a warp issued an instruction rather than instructions issued or executed. Both can be interesting. For CC 7.0 - 9.0 the ratio of issue_active to inst_executed is 1.0 in most cases; however, there are a few cases where more instructions are issued than executed (retired).
Thank you very much. I don’t have any other questions.