What does nvmlDeviceGetComputeRunningProcesses actually report?

I use nvmlDeviceGetComputeRunningProcesses to get the per-process GPU memory usage.

#include <stdio.h>
#include <nvml.h>

// nvmlInit() is called earlier in the program
nvmlDevice_t device;
nvmlReturn_t result;
nvmlProcessInfo_t info[64];
unsigned int infoCount = 64;   // in: capacity of info[], out: number of processes

result = nvmlDeviceGetHandleByIndex(0, &device);
if (result != NVML_SUCCESS) {
    printf("Failed to get handle for device 0: %s\n", nvmlErrorString(result));
}

result = nvmlDeviceGetComputeRunningProcesses(device, &infoCount, info);
if (result == NVML_SUCCESS) {
    for (unsigned int j = 0; j < infoCount; j++) {
        printf("  PID: %u\n", info[j].pid);
        // usedGpuMemory is an unsigned long long (bytes), so print it with %llu
        printf("  Memory Utilization: %llu\n", info[j].usedGpuMemory);
        printf("--------------------------\n");
    }
}

The program output and the nvidia-smi output are as follows:

PID: 1240406
Memory Utilization: 4120903680

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.08             Driver Version: 535.161.08   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA L4                      On  | 00000000:65:01.0 Off |                    0 |
| N/A   34C    P0              27W /  72W |  12130MiB / 23034MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A   1240406      C   python                                    12122MiB |
+---------------------------------------------------------------------------------------+

The python process is an inference workload that is currently idle.

So what does usedGpuMemory actually mean? Is it the total memory used by the process (but 4120903680 bytes is 3930 MiB, while nvidia-smi reports 12122 MiB for the same PID), or is it runtime memory usage (but then it should be 0 while the process is idle, and it is not)? And how does it differ from the memory usage reported by nvmlDeviceGetProcessUtilization? (I believe that one is runtime memory usage, because it drops to 0 when inference is idle.)
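
For reference, this is roughly how I read nvmlDeviceGetProcessUtilization, using what I understand to be the usual two-call pattern (ask for the sample count first, then fetch the samples). It is only a minimal sketch with error handling trimmed; the memUtil field is the value I have been treating as "runtime memory usage":

// needs <stdlib.h> for malloc, in addition to <nvml.h>
unsigned int sampleCount = 0;
// First call with a NULL buffer just reports how many samples are available
// (lastSeenTimeStamp = 0 requests all buffered samples).
nvmlReturn_t r = nvmlDeviceGetProcessUtilization(device, NULL, &sampleCount, 0);
if (r == NVML_ERROR_INSUFFICIENT_SIZE && sampleCount > 0) {
    nvmlProcessUtilizationSample_t *samples =
        malloc(sampleCount * sizeof(nvmlProcessUtilizationSample_t));
    r = nvmlDeviceGetProcessUtilization(device, samples, &sampleCount, 0);
    if (r == NVML_SUCCESS) {
        for (unsigned int k = 0; k < sampleCount; k++) {
            // smUtil / memUtil are utilization values over the sample period,
            // which is why they drop to 0 while the inference process is idle
            printf("  PID: %u  smUtil: %u  memUtil: %u\n",
                   samples[k].pid, samples[k].smUtil, samples[k].memUtil);
        }
    }
    free(samples);
}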