I want to know the exact meaning of ‘gpc__cycles_elapsed.max’.
What is GPC (Graphic Processing Cluster)?
Also, I found 'dram__cycles_elapsed.avg.per_second [cycle/second] 'and ‘gpc__cycles_elapsed.avg.per_second [cycle/second]’ in metric.
What does ‘gpc__cycles_elapsed.max’, 'dram__cycles_elapsed.avg.per_second [cycle/second] 'and ‘gpc__cycles_elapsed.avg.per_second [cycle/second]’ mean and what is the difference?
If I want to know the number of “GPU” cycles required for one kernel to execute, which of the three should I use? Or is there another metric?
The NVIDIA GPU has logic in many different clock domains.
<unit>__cycles_elapsed is the number of clock cycles between the start counting and stop counting triggers in the 's clock domain.
GPC is the Graphics Processing Cluster. GPCs are one of the building blocks of the NVIDIA GPU. Small GPUs may have only 1 GPC. Larger GPUs have 2, 4, 6, … GPCs. A GPC contains a set of SMs, L1 caches, Texture caches, and graphics pipe.
DRAM is the name for the GPU memory controllers for GDDR or HBM memory.
The GPC clock is the primary clock advertised for a NVIDIA GPU. This may be called the graphics clock, application clock, base clock or boost clock.
The DRAM clock is the base clock for the GDDR or HBM memory.
For performance tuning
gpc__cycles_elapsed.avg or .max are the primary measurement. The elapsed clocks in each GPC can vary slightly so NCU uses
<units>__cycles_elapsed into a clock frequency.
The __elapsed_cycles are used for throughput calculations such as
sm__cycles_active.avg.pct_of_peak_sustained_elapsed = sm__cycles_active.avg / sm__cycles_elapsed.avg.
This is the percentage of cycles that the SM was active during the counter collection period.