What is IPC (Instructions Per Cycle)?

fishseeker · October 13, 2018, 8:58am

What is IPC? The documentation explains it as "instructions executed per cycle", but what counts as an instruction here? Does "cycle" refer to an SM cycle? Can IPC represent the running time of a program? And why is the MAX IPC of my 1080 Ti 6?

Greg · October 13, 2018, 7:02pm

The CUPTI/nvprof metric ipc is defined as

ipc = inst_executed / active_cycles

inst_executed is the number of warp instructions (not threads) retired by an SM.

active_cycles is the number of SM cycles the SM had at least one active warp.

elapsed_cycles is the number of SM cycles during the PM collection period.
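As a rough illustration of the definition above, the following sketch computes ipc from two raw counter values. The function name and the sample numbers are made up for illustration; they are not real profiler output.

```python
def ipc(inst_executed: int, active_cycles: int) -> float:
    """Warp instructions retired per active SM cycle."""
    if active_cycles == 0:
        # No active warp was ever resident, so there is no throughput to report.
        return 0.0
    return inst_executed / active_cycles


# Hypothetical counter values for one SM (assumed, not measured).
sample_ipc = ipc(inst_executed=1_200_000, active_cycles=800_000)
print(sample_ipc)
```

Note that inst_executed counts warp instructions, so one retired instruction here covers up to 32 threads.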

The ipc metric is best used as a throughput metric to determine if the application is compute bound.

On the Maxwell and Pascal architectures, the SM has 4 sub-partitions. Each sub-partition has a warp scheduler, and each warp scheduler can issue 2 instructions per cycle, so the maximum theoretical ipc is 4 × 2 = 8. However, the instruction fetch rate and the maximum rate of variable-latency instructions (load, store, transcendentals, …) limit the sustained rate to 6 instructions per cycle.
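The arithmetic behind those two limits can be sketched as follows. The constants come from the Maxwell/Pascal figures in this post; the helper name pct_of_sustained_peak is hypothetical and is only meant to show how a measured ipc might be compared against the sustained limit.

```python
# Figures for Maxwell/Pascal as stated in this post.
SUB_PARTITIONS_PER_SM = 4       # one warp scheduler per sub-partition
ISSUE_PER_SCHEDULER = 2         # instructions a scheduler can issue per cycle
SUSTAINED_PEAK_IPC = 6          # limited by fetch and variable-latency issue rates

# Theoretical peak: every scheduler dual-issues on every cycle.
theoretical_peak_ipc = SUB_PARTITIONS_PER_SM * ISSUE_PER_SCHEDULER


def pct_of_sustained_peak(measured_ipc: float) -> float:
    """Hypothetical helper: measured ipc as a percentage of the sustained peak."""
    return 100.0 * measured_ipc / SUSTAINED_PEAK_IPC


print(theoretical_peak_ipc)
print(pct_of_sustained_peak(3.0))
```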

ipc cannot be used to represent running time; it is the instruction throughput over the collection period. An FP32-bound kernel can achieve close to 4 IPC per SM. A memory-bound kernel is more likely to have < 1 IPC per SM.
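To see why ipc alone says nothing about running time, consider two kernels with the same ipc but different instruction counts. This is only a sketch: the kernel numbers are invented, and active_cycles_needed is a hypothetical helper that rearranges the ipc definition.

```python
def active_cycles_needed(inst_executed: int, ipc: float) -> float:
    """Rearranged definition: active_cycles = inst_executed / ipc."""
    return inst_executed / ipc


# Two made-up kernels running at the same ipc on one SM.
kernel_a_cycles = active_cycles_needed(inst_executed=1_000_000, ipc=2.0)
kernel_b_cycles = active_cycles_needed(inst_executed=4_000_000, ipc=2.0)

# Same throughput, but kernel B needs four times as many active cycles.
print(kernel_a_cycles, kernel_b_cycles)
```

Running time depends on how many instructions have to be executed as well as on how fast they retire, so two kernels with identical ipc can take very different amounts of time.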

CUPTI and nvprof are moving to a new version of the PerfWorks metrics. The PerfWorks metrics (available in the new Nsight Compute profiler and the Nsight VSE CUDA profiler) are:

PerfWorks Metrics (< v1, Kepler-Volta)
sm__inst_executed_{avg, max, min, sum}
sm__inst_executed_{avg, sum}_per_{active, elapsed}_cycle
sm__inst_executed_per_{active, elapsed}_cycle_sol_pct

PerfWorks Metrics (> v1, Nsight 6.0 Turing)
sm__inst_executed.{avg, max, min, sum}
sm__inst_executed.{avg, max, min, sum}.per_cycle_{active, elapsed}
sm__inst_executed.{avg, max, min, sum}.pct_peak_sustained_per_{active, elapsed}
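My reading of the {avg, max, min, sum} suffixes is that they roll the per-SM counter up across all SM instances; treat that interpretation as an assumption rather than a statement of how the profiler computes them. Under that assumption, a sketch with made-up per-SM counts looks like this:

```python
# Hypothetical inst_executed counts, one entry per SM (invented values).
per_sm_inst_executed = [100, 200, 300, 400]

# Assumed meaning of the roll-up suffixes: aggregate across SM instances.
rollups = {
    "sum": sum(per_sm_inst_executed),
    "avg": sum(per_sm_inst_executed) / len(per_sm_inst_executed),
    "min": min(per_sm_inst_executed),
    "max": max(per_sm_inst_executed),
}

for suffix, value in rollups.items():
    print(f"sm__inst_executed.{suffix} = {value}")
```

A large gap between min and max in such a roll-up would suggest that the work is unevenly distributed across SMs.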

fishseeker · October 15, 2018, 7:54am

I don't know how to @ somebody, but thanks a lot @(Greg@NV)
