I use nvmlDeviceGetComputeRunningProcesses to get per-process GPU memory usage:
#include <stdio.h>
#include <nvml.h>
int main(void) {
    nvmlDevice_t device;
    unsigned int infoCount = 64;  // in: capacity of info[]; out: number of processes returned
    nvmlProcessInfo_t info[64];
    nvmlReturn_t result = nvmlInit();
    if (result != NVML_SUCCESS) return 1;
    result = nvmlDeviceGetHandleByIndex(0, &device);
    if (result != NVML_SUCCESS) {
        printf("Failed to get handle for device 0: %s\n", nvmlErrorString(result));
    }
    result = nvmlDeviceGetComputeRunningProcesses(device, &infoCount, info);
    if (result == NVML_SUCCESS) {
        for (unsigned int j = 0; j < infoCount; j++) {
            printf("  PID: %u\n", info[j].pid);
            // usedGpuMemory is an unsigned long long in bytes; %u would truncate it to 32 bits
            printf("  Memory Utilization: %llu\n", info[j].usedGpuMemory);
            printf("--------------------------\n");
        }
    }
    nvmlShutdown();
    return 0;
}
The result and the nvidia-smi output are as follows:
PID: 1240406
Memory Utilization: 4120903680
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.08 Driver Version: 535.161.08 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA L4 On | 00000000:65:01.0 Off | 0 |
| N/A 34C P0 27W / 72W | 12130MiB / 23034MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| 0 N/A N/A 1240406 C python 12122MiB |
+---------------------------------------------------------------------------------------+
The python process is an inference workload that is currently idle.

So what does usedGpuMemory mean? Is it total memory usage (but 4120903680 bytes is only about 3930 MiB, which differs from the 12122 MiB nvidia-smi reports), or runtime memory usage (but then it should be 0 while inference is idle, and it is not)? And how does it differ from the memory figure reported by nvmlDeviceGetProcessUtilization? I know that one reflects runtime usage, because when inference is idle it reports 0.
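
For reference, here is a minimal sketch of how I call nvmlDeviceGetProcessUtilization for the comparison. The fixed 64-slot sample buffer and lastSeenTimeStamp = 0 (ask for every buffered sample) are my own simplifications, not anything required by NVML:

#include <stdio.h>
#include <nvml.h>
int main(void) {
    nvmlDevice_t device;
    nvmlProcessUtilizationSample_t samples[64];
    unsigned int sampleCount = 64;            // in: buffer capacity; out: samples returned
    unsigned long long lastSeenTimeStamp = 0; // 0 = return all buffered samples

    if (nvmlInit() != NVML_SUCCESS) return 1;
    if (nvmlDeviceGetHandleByIndex(0, &device) != NVML_SUCCESS) return 1;

    nvmlReturn_t r = nvmlDeviceGetProcessUtilization(device, samples, &sampleCount,
                                                     lastSeenTimeStamp);
    if (r == NVML_SUCCESS) {
        for (unsigned int k = 0; k < sampleCount; k++) {
            // smUtil/memUtil are sampled per-PID utilization values,
            // not byte counts like usedGpuMemory.
            printf("  PID: %u  smUtil: %u  memUtil: %u\n",
                   samples[k].pid, samples[k].smUtil, samples[k].memUtil);
        }
    }
    nvmlShutdown();
    return 0;
}

When the inference process sits idle, the memUtil printed here is 0, while usedGpuMemory from the first snippet stays at the large value above.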