Memory metric reported by NVML and nvidia-smi seems to differ

sgambient · May 24, 2019, 12:52am

Hi, I have a simple question about util.memory (reported by nvmlDeviceGetUtilizationRates) and nvidia-smi. NVML reports 40%, whereas nvidia-smi shows 6715MiB / 7982MiB. Why are these two numbers different? Thanks.

sgambient · May 24, 2019, 7:39pm

Is anyone able to answer this?

anon56509511 · May 25, 2019, 7:11am

UTILIZATION is not ALLOCATION

Try “nvidia-smi dmon” or “nvidia-smi -q”.

nvmlDeviceGetUtilizationRates() → nvmlUtilization_t →
memory: Percent of time over the past sample period during which global (device) memory was being read or written.
(https://docs.nvidia.com/deploy/nvml-api/structnvmlUtilization__t.html#structnvmlUtilization__t)

nvmlDeviceGetMemoryInfo() → nvmlMemory_t →
total: Total installed FB memory (in bytes).
used: Allocated FB memory (in bytes). Note that the driver/GPU always sets aside a small amount of memory for bookkeeping.
(https://docs.nvidia.com/deploy/nvml-api/structnvmlMemory__t.html#structnvmlMemory__t)
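
To make the distinction concrete, here is a minimal sketch that reads both values side by side. It assumes the Python NVML bindings (the `pynvml` module from the nvidia-ml-py package) and an NVIDIA driver are installed; the helper names `allocation_percent` and `read_gpu_memory_metrics` are made up for illustration and are not part of NVML.

```python
def allocation_percent(used_bytes: int, total_bytes: int) -> float:
    """Share of framebuffer memory that is allocated, as a percentage."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * used_bytes / total_bytes


def read_gpu_memory_metrics(index: int = 0) -> dict:
    """Query utilization and allocation for one GPU through NVML.

    Assumption: pynvml and an NVIDIA driver are available. The import is
    done lazily so allocation_percent() still works on machines without them.
    """
    import pynvml

    pynvml.nvmlInit()
    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(index)
        # Percent of time device memory was being read or written.
        util = pynvml.nvmlDeviceGetUtilizationRates(handle)
        # Bytes of framebuffer memory installed and currently allocated.
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
        return {
            "memory_utilization_pct": util.memory,
            "used_mib": mem.used // 1024**2,
            "total_mib": mem.total // 1024**2,
            "allocation_pct": allocation_percent(mem.used, mem.total),
        }
    finally:
        pynvml.nvmlShutdown()
```

Calling `read_gpu_memory_metrics(0)` on a machine with a GPU returns both numbers at once. With the figures from the question, `allocation_percent(6715, 7982)` is about 84%, which is a different quantity from the 40% utilization and has no reason to match it.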

sgambient · May 26, 2019, 7:43pm

Thanks for the clarification. So utilization is a measure of memory activity (the percentage of time device memory was being read or written), while usage refers to how much memory has been allocated or reserved.

That explains how an OOM can happen even when utilization is low (40%) while allocation is high (6715MiB / 7982MiB).

Is there any simple way to trade off the two (i.e. utilization vs. allocation)?
