DCGM not reporting running processes

Hi - I’m trying to use Datacenter GPU Manager (DCGM) to gather info about running processes on a GPU. My code is fairly straightforward:

import dcgm_fields
from DcgmReader import DcgmReader

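def get_gpu_info():
    # Fields to watch: device name, UUID, total framebuffer, and compute PIDs.
    myFieldIds = [
        dcgm_fields.DCGM_FI_DEV_NAME,
        dcgm_fields.DCGM_FI_DEV_UUID,
        dcgm_fields.DCGM_FI_DEV_FB_TOTAL,
        dcgm_fields.DCGM_FI_DEV_COMPUTE_PIDS,
    ]
    dr = DcgmReader(fieldIds=myFieldIds)
    # Returns {gpu_id: {field_id: value}}.
    dr_gpu_data = dr.GetLatestGpuValuesAsFieldIdDict()
    gpu_data = {}
    for gpu, gpu_info in dr_gpu_data.items():
        print(gpu_info)
        # Index by the named constants rather than the raw IDs (50, 54, 221).
        gpu_data[gpu] = {
            'model': gpu_info[dcgm_fields.DCGM_FI_DEV_NAME],
            'gpu_id': gpu_info[dcgm_fields.DCGM_FI_DEV_UUID],
            'gpu_compute_pids': gpu_info[dcgm_fields.DCGM_FI_DEV_COMPUTE_PIDS],
        }
    return gpu_data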

When I run nvidia-smi I see the TensorFlow processes bound to each of my GPUs, and they are all listed with type “C” (compute). However, when I run the function above, the compute PIDs field (221) is returned as None for every GPU.

>>> get_gpu_info()
{50: 'Tesla V100-SXM2-32GB', 250: 32480, 54: 'GPU-c97bfcc0-f899-101c-ef1d-xxxxxxxx', 221: None}
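
One way to cross-check what DCGM returns is to ask nvidia-smi directly for its compute-process list and group the PIDs by GPU UUID, so they can be compared with field 54 above. The sketch below is not from the original post. It assumes nvidia-smi is on the PATH and that your driver's --query-compute-apps accepts the gpu_uuid, pid, and process_name fields; the helper names parse_compute_apps and query_compute_apps are made up for illustration.

```python
import csv
import io
import subprocess

# Columns requested from nvidia-smi (assumed to be supported by the installed driver).
QUERY = "gpu_uuid,pid,process_name"


def parse_compute_apps(csv_text):
    """Group compute-process PIDs by GPU UUID from nvidia-smi CSV output."""
    pids_by_gpu = {}
    for row in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        gpu_uuid, pid = row[0].strip(), row[1].strip()
        if not pid.isdigit():
            continue  # skip anything that is not a numeric PID
        pids_by_gpu.setdefault(gpu_uuid, []).append(int(pid))
    return pids_by_gpu


def query_compute_apps():
    """Run nvidia-smi and return {gpu_uuid: [pid, ...]}."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-compute-apps={QUERY}", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_compute_apps(result.stdout)
```

If query_compute_apps() lists PIDs for a UUID while DCGM still reports None for field 221 on the same GPU, that at least narrows the problem to the DCGM side rather than the driver.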

Hello,

I am not an expert on DCGM; the best I can offer you is our Data Center GPU Manager User Guide:

https://docs.nvidia.com/datacenter/dcgm/latest/dcgm-user-guide/index.html

Hope this helps you resolve your issue.

Best regards,

Tom
Devtalk Community Manager