nvidia-smi GPU metrics show "combined" results when multiple vGPUs are running on the same card

Hi, I am trying to gather GPU and memory metrics from each individual vGPU. This is my current setup:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.105.14             Driver Version: 525.105.14                |
|---------------------------------+------------------------------+------------+
| GPU  Name                       | Bus-Id                       | GPU-Util   |
|      vGPU ID     Name           | VM ID     VM Name            | vGPU-Util  |
|=================================+==============================+============|
|   0  NVIDIA A16                 | 00000000:47:00.0             |  96%       |
|      3252070918  NVIDIA A16-4A  | 9107803   vc14wingpu01       |     95%    |
|      3252071060  NVIDIA A16-4A  | 9108788   vc14wingpu02       |      0%    |
+---------------------------------+------------------------------+------------+

I am running a couple of VMs in vCenter 8.0U1 on an ESXi host configured to run the GPU in Shared Direct mode (vendor shared passthrough graphics). Both VMs run Windows 10, and a constant GPU stress test pushes each VM's GPU utilization to ~50%.

Issue: I am using the command nvidia-smi vgpu -q -i 00000000:47:00.0 -u to gather the instantaneous utilization of each vGPU and comparing it with the metrics reported by the Windows operating system (Task Manager, Performance tab). nvidia-smi reports the following:

GPU        vGPU     sm    mem    enc    dec
Idx          Id      %      %      %      %
  0  3252070918     94      0      0      0
  0  3252071060      0      0      0      0
  0  3252070918     94      0      0      0
  0  3252071060      0      0      0      0

That is, the utilization is always 0 for one of the vGPUs, and the utilization for the other vGPU is the “combined” utilization of both VMs (as reported by the OS in each VM).
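
In case it helps anyone reproduce the comparison, here is a minimal Python sketch of how the utilization rows above could be parsed and averaged per vGPU before being compared with the Task Manager readings. It assumes the six-column layout shown in this post (GPU index, vGPU ID, then sm, mem, enc, dec); the function names parse_vgpu_utilization and mean_sm are hypothetical helpers, not part of any NVIDIA tool, and the sample text is copied from the output above rather than captured live.

```python
from collections import defaultdict

# Sample text copied from the output in this post (assumed layout:
# GPU index, vGPU ID, then sm, mem, enc, dec percentages).
SAMPLE = """\
GPU vGPU sm mem enc dec
Idx Id % % % %
0 3252070918 94 0 0 0
0 3252071060 0 0 0 0
0 3252070918 94 0 0 0
0 3252071060 0 0 0 0
"""

FIELDS = ("sm", "mem", "enc", "dec")


def parse_vgpu_utilization(text):
    """Group utilization rows by vGPU ID (hypothetical helper).

    Returns {vgpu_id: [{"sm": int, "mem": int, "enc": int, "dec": int}, ...]}.
    """
    samples = defaultdict(list)
    for line in text.splitlines():
        # Treat a leading '#' (if the tool prints one on header lines) as whitespace.
        parts = line.replace("#", " ").split()
        # Keep only rows with six integer columns; header rows and rows
        # containing non-numeric placeholders are skipped.
        if len(parts) != 6 or not all(p.isdigit() for p in parts):
            continue
        _gpu_idx, vgpu_id, *values = parts
        samples[vgpu_id].append(dict(zip(FIELDS, map(int, values))))
    return dict(samples)


def mean_sm(samples):
    """Average the sm column per vGPU ID (hypothetical helper)."""
    return {
        vgpu_id: sum(row["sm"] for row in rows) / len(rows)
        for vgpu_id, rows in samples.items()
    }


if __name__ == "__main__":
    for vgpu_id, avg in mean_sm(parse_vgpu_utilization(SAMPLE)).items():
        print(f"vGPU {vgpu_id}: mean sm utilization {avg:.1f}%")
```

With the sample above, this gives a mean sm utilization of 94% for vGPU 3252070918 and 0% for vGPU 3252071060, compared with the ~50% that each VM reports in Task Manager.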

Questions:

  1. Why does the utilization reported by the OS differ from the utilization reported by nvidia-smi when multiple vGPUs are running on the same card?
  2. Is there a way, using nvidia-smi, to gather metrics that closely match what the OS reports (because in this case the metrics reported by the OS are more accurate)?

Hello Prasad,

Thank you for opening a thread on the NVIDIA developer forum.

For support on this issue, if you have an active support contract, we invite you to open a case with the NVIDIA Enterprise Support team, who will be best able to assist you. A case can be opened through the support portal here:

Thank you,
Abigail