Why is the utilization derived from kernel activity records not equal to the GPU utilization reported by NVML?

In NVML, GPU utilization is defined as follows:

typedef struct nvmlUtilization_st
{
    unsigned int gpu;    //!< Percent of time over the past second during which one or more kernels was executing on the GPU
    unsigned int memory; //!< Percent of time over the past second during which global (device) memory was being read or written
} nvmlUtilization_t;

Based on the above definition, I expected that I could derive GPU utilization from CUPTI kernel activity records by measuring kernel active time versus idle time over regular intervals.
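To make the derivation concrete, here is a minimal sketch (my own post-processing code, not a CUPTI API) that computes the busy fraction of a time window as the union of kernel execution spans taken from activity records; overlapping kernels from concurrent streams are merged so they are not double-counted:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* One kernel activity record reduced to its execution span (timestamps in ns). */
typedef struct { uint64_t start; uint64_t end; } span_t;

static int cmp_span(const void *a, const void *b) {
    const span_t *x = a, *y = b;
    return (x->start > y->start) - (x->start < y->start);
}

/* Total time within [win_start, win_end) covered by at least one kernel,
 * i.e. the length of the union of the (possibly overlapping) spans. */
static uint64_t busy_time(span_t *s, size_t n,
                          uint64_t win_start, uint64_t win_end) {
    qsort(s, n, sizeof *s, cmp_span);
    uint64_t busy = 0, cur_start = 0, cur_end = 0;
    int open = 0;
    for (size_t i = 0; i < n; i++) {
        /* Clamp each span to the measurement window. */
        uint64_t a = s[i].start < win_start ? win_start : s[i].start;
        uint64_t b = s[i].end   > win_end   ? win_end   : s[i].end;
        if (a >= b) continue;                 /* span lies outside the window */
        if (!open) { cur_start = a; cur_end = b; open = 1; }
        else if (a <= cur_end) {              /* overlaps the open interval: extend */
            if (b > cur_end) cur_end = b;
        } else {                              /* gap: close interval, start a new one */
            busy += cur_end - cur_start;
            cur_start = a; cur_end = b;
        }
    }
    if (open) busy += cur_end - cur_start;
    return busy;
}
```

The derived utilization is then `100 * busy_time(...) / (win_end - win_start)`. This is the number that consistently comes out lower than what NVML reports.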

However, the GPU utilization derived from CUPTI kernel activity records is always lower than the GPU utilization reported by NVML, as shown in the following figure, where the orange bars represent CUPTI and the gray bars represent NVML…

The difference becomes more severe when multiple processes share the same GPU. For example, when two processes each train cifar10-efficientnet B0 with a batch size of 2 (the first experiment case in the above graph), CUPTI shows about 50% GPU utilization (roughly twice the single-process ratio from that case), while NVML reports 90%.

From multiple tests, I have concluded that GPU utilization derived from kernel activity records cannot reproduce the GPU utilization reported by NVML. Instead, NVML's GPU utilization appears to track the GR Engine Active metric more closely, which is fundamentally not derived from kernel execution time.
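One way the two numbers could diverge this much: if NVML counts an entire sampling period as "active" whenever at least one kernel executed at any point in it (this is my assumption about its internal accounting, not documented behavior), then many short kernels can inflate the period-based number far above the exact busy fraction. A small self-contained sketch of the two accounting schemes:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Non-overlapping, sorted kernel execution spans (timestamps in ns). */
typedef struct { uint64_t start; uint64_t end; } span_t;

/* Exact busy fraction in percent: sum of span lengths over the window.
 * This is what the CUPTI-record-based derivation measures. */
static unsigned exact_util(const span_t *s, size_t n, uint64_t win) {
    uint64_t busy = 0;
    for (size_t i = 0; i < n; i++) busy += s[i].end - s[i].start;
    return (unsigned)(100 * busy / win);
}

/* Period-based utilization in percent: the fraction of fixed-size
 * sampling periods that contain any kernel activity at all
 * (the assumed NVML-style accounting). */
static unsigned period_util(const span_t *s, size_t n,
                            uint64_t win, uint64_t period) {
    uint64_t active = 0, nperiods = win / period;
    for (uint64_t p = 0; p < nperiods; p++) {
        uint64_t ps = p * period, pe = ps + period;
        for (size_t i = 0; i < n; i++)
            if (s[i].start < pe && s[i].end > ps) { active++; break; }
    }
    return (unsigned)(100 * active / nperiods);
}
```

With one 10 ns kernel in every 100 ns period of a 1000 ns window, `exact_util` yields 10% while `period_util` yields 100%, which is qualitatively the shape of the gap I observe; whether NVML actually works this way is exactly what I am asking.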

However, I am still not completely certain about this conclusion. Could you please help me verify this issue?

I confirmed that when I instead use the queued and completion timestamps enabled via cuptiActivityEnableLatencyTimestamps(1), the results are similar to the per-process utilization collected via NVML. However, it is unclear whether this truly reflects the criterion NVML uses to decide that a kernel is active.
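For reference, this is the comparison I am making. The struct below is illustrative only (it is not the CUPTI record layout); it just shows that measuring over the wider [queued, completed) span absorbs the launch and scheduling gaps between kernels, which pushes the derived number up toward NVML's:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A kernel record carrying both execution and latency timestamps (ns).
 * Field names are illustrative, not the actual CUPTI activity struct. */
typedef struct {
    uint64_t queued, start, end, completed;
} record_t;

/* Busy time over non-overlapping records, using either the execution
 * span [start, end) or the wider latency span [queued, completed). */
static uint64_t covered(const record_t *r, size_t n, int use_latency) {
    uint64_t busy = 0;
    for (size_t i = 0; i < n; i++)
        busy += use_latency ? r[i].completed - r[i].queued
                            : r[i].end - r[i].start;
    return busy;
}
```

For back-to-back launches, the latency spans are nearly contiguous even when the execution spans leave gaps, so the latency-based utilization lands much closer to NVML's figure in my measurements.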