Memory metric reported by NVML and nvidia-smi seems to differ

Hi, I have a simple question about util.memory (reported by nvmlDeviceGetUtilizationRates) and nvidia-smi. The value reported by NVML is 40%, whereas nvidia-smi shows 6715MiB / 7982MiB. Why are these two numbers different? Thanks.

Is anyone able to answer this?

UTILIZATION is not ALLOCATION

try: “nvidia-smi dmon” or “nvidia-smi -q”

nvmlDeviceGetUtilizationRates() -> nvmlUtilization_t ->
memory: Percent of time over the past sample period during which global (device) memory was being read or written.
(https://docs.nvidia.com/deploy/nvml-api/structnvmlUtilization__t.html#structnvmlUtilization__t)

nvmlDeviceGetMemoryInfo() -> nvmlMemory_t ->
total: Total installed FB memory (in bytes).
used: Allocated FB memory (in bytes). Note that the driver/GPU always sets aside a small amount of memory for bookkeeping.
(https://docs.nvidia.com/deploy/nvml-api/structnvmlMemory__t.html#structnvmlMemory__t)
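For what it's worth, here is a minimal sketch that reads both numbers side by side. It assumes the pynvml Python bindings are installed and that GPU index 0 is the device of interest. The helper names allocation_percent and query_gpu are made up for illustration and are not part of NVML. With the figures from the question, 6715MiB out of 7982MiB works out to roughly 84% allocated, which has nothing to do with the 40% utilization figure.

```python
MIB = 1024 ** 2


def allocation_percent(used_bytes, total_bytes):
    """Share of framebuffer memory that is allocated, as a percentage."""
    return 100.0 * used_bytes / total_bytes


def query_gpu(index=0):
    """Return (memory utilization %, memory allocation %) for one GPU.

    Hypothetical helper: utilization comes from nvmlDeviceGetUtilizationRates,
    allocation is derived from nvmlDeviceGetMemoryInfo.
    """
    import pynvml  # assumption: the pynvml bindings are installed

    pynvml.nvmlInit()
    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(index)
        util = pynvml.nvmlDeviceGetUtilizationRates(handle)  # time-based
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)         # byte counts
        return util.memory, allocation_percent(mem.used, mem.total)
    finally:
        pynvml.nvmlShutdown()


if __name__ == "__main__":
    # Numbers from the original question: 6715MiB used of 7982MiB total.
    print(f"{allocation_percent(6715 * MIB, 7982 * MIB):.1f}% allocated")

    try:
        busy, allocated = query_gpu(0)
        print(f"memory utilization: {busy}%  memory allocated: {allocated:.1f}%")
    except Exception as exc:  # no GPU, no driver, or pynvml missing
        print(f"NVML query skipped: {exc}")
```

The two values can move independently: a process can hold most of the framebuffer while barely touching it, or stream heavily through a small allocation.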

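Thanks for the clarification. So utilization essentially refers to how busy the memory is (the percentage of time device memory was being read or written, which is a rough proxy for bandwidth activity rather than a percentage of peak bandwidth). Usage refers to how much memory has been allocated/reserved.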

That makes sense for the situation where an OOM happens even though utilization is low (40%) but reservation/allocation is high.

Is there any simple way to trade off the two (e.g., utilization vs. allocation)?