Hi
Following the metric collection guide, I collected these IPC-related metrics:
Metric                                      Unit         Value
gpu__cycles_active.sum                      cycle        657,154
gpu__inst_executed.sum                                   (!) n/a
sm__cycles_active.sum                       cycle        53,309,893
sm__inst_executed.avg.per_cycle_active      inst/cycle   3.40
sm__inst_executed.sum                       inst         181,313,579
smsp__inst_executed.avg.per_cycle_active    inst/cycle   0.85
sm__inst_executed.avg.per_cycle_active (3.40) is 4 times smsp__inst_executed.avg.per_cycle_active (0.85), which is what I expect, since each SM has 4 sub-partitions (SMSPs).
Also, sm__inst_executed.sum / sm__cycles_active.sum (181,313,579 / 53,309,893 ≈ 3.40) matches sm__inst_executed.avg.per_cycle_active, which is also what I expect.
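For reference, the two checks above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, using the values copied from the output above; the variable names are my own and are not part of any profiler API.

```python
# Values copied from the profiler output in this post.
sm_inst_executed_sum = 181_313_579   # sm__inst_executed.sum (inst)
sm_cycles_active_sum = 53_309_893    # sm__cycles_active.sum (cycle)
sm_ipc_reported = 3.40               # sm__inst_executed.avg.per_cycle_active
smsp_ipc_reported = 0.85             # smsp__inst_executed.avg.per_cycle_active

# Check 1: SM-level IPC relative to SMSP-level IPC.
sm_to_smsp_ratio = sm_ipc_reported / smsp_ipc_reported

# Check 2: IPC recomputed from the raw instruction and cycle sums.
sm_ipc_from_sums = sm_inst_executed_sum / sm_cycles_active_sum

print(f"SM IPC / SMSP IPC    = {sm_to_smsp_ratio:.2f}")
print(f"inst sum / cycle sum = {sm_ipc_from_sums:.2f}")
```

The first ratio comes out at 4.00 and the second at 3.40, so the sm__ and smsp__ numbers are consistent with each other.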
I would like to know why the gpu__ metrics do not look correct. I expected gpu__inst_executed.sum to equal SMs * sm__inst_executed.sum (the 3080 has 68 SMs), and the same for cycles. That way I would see the total number of instructions executed on the GPU and the total number of cycles the GPU was active. Isn't that correct?
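To make the mismatch concrete, here is a small sketch of what that expectation would predict next to what was actually reported. It only restates my assumption (multiplying by the SM count) and says nothing about how the profiler really defines the gpu__ metrics; the variable names are made up for illustration.

```python
NUM_SMS = 68  # SM count of the RTX 3080, as stated in this post

# Values copied from the profiler output in this post.
sm_inst_executed_sum = 181_313_579   # sm__inst_executed.sum (inst)
sm_cycles_active_sum = 53_309_893    # sm__cycles_active.sum (cycle)
gpu_cycles_active_sum = 657_154      # gpu__cycles_active.sum (cycle)

# What the expectation "gpu__ = SMs * sm__" would predict.
expected_gpu_inst = NUM_SMS * sm_inst_executed_sum
expected_gpu_cycles = NUM_SMS * sm_cycles_active_sum

# How the reported SM cycle sum actually relates to the reported GPU cycles.
observed_ratio = sm_cycles_active_sum / gpu_cycles_active_sum

print(f"expected gpu__inst_executed.sum : {expected_gpu_inst:,}")
print(f"expected gpu__cycles_active.sum : {expected_gpu_cycles:,}")
print(f"reported gpu__cycles_active.sum : {gpu_cycles_active_sum:,}")
print(f"sm cycles sum / gpu cycles      : {observed_ratio:.2f}")
```

The reported gpu__cycles_active.sum is far below the value my expectation predicts, which is the discrepancy I am asking about.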