Nvidia-smi GPU metrics show "combined" results when there are multiple vGPUs running on the same card

Hi, I am trying to gather GPU and memory metrics from each individual vGPU. This is my current setup:
| NVIDIA-SMI 525.105.14            Driver Version: 525.105.14          |
| GPU  Name                        | Bus-Id                 | GPU-Util  |
|      vGPU ID      Name           | VM ID     VM Name      | vGPU-Util |
|   0  NVIDIA A16                  | 00000000:47:00.0       |   96%     |
|      3252070918   NVIDIA A16-4A  | 9107803   vc14wingpu01 |   95%     |
|      3252071060   NVIDIA A16-4A  | 9108788   vc14wingpu02 |    0%     |

I am running a couple of VMs in vCenter 8.0U1 on an ESXi host configured to run the GPU in Shared Direct mode (vendor shared passthrough graphics). Both VMs run Windows 10, and a constant GPU stress test pushes each VM's GPU utilization to ~50%.

Issue: I am using the command nvidia-smi vgpu -q -i 00000000:47:00.0 -u to gather the instantaneous utilization of each vGPU and comparing it with the metrics reported by the Windows operating system (Task Manager, Performance tab). nvidia-smi reports the following:

GPU   vGPU          sm    mem   enc   dec
Idx   Id            %     %     %     %
  0   3252070918    94     0     0     0
  0   3252071060     0     0     0     0
  0   3252070918    94     0     0     0
  0   3252071060     0     0     0     0

In other words, one of the vGPUs always reports 0% utilization, while the other vGPU reports what looks like the "combined" utilization of both VMs together (each VM's OS reports ~50%).
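
For reference, this is roughly how I am collecting the numbers above. It is only a minimal Python sketch: the column parsing assumes the layout shown above, it assumes each invocation of the command prints one snapshot and then exits, and the bus ID is the one from my setup.

```python
import subprocess
import time
from collections import defaultdict

BUS_ID = "00000000:47:00.0"  # bus ID of the A16 in my setup


def sample_vgpu_sm_util():
    """Run nvidia-smi once and return {vgpu_id: sm utilization %}."""
    out = subprocess.run(
        ["nvidia-smi", "vgpu", "-q", "-i", BUS_ID, "-u"],
        capture_output=True, text=True, check=True,
    ).stdout
    utils = {}
    for line in out.splitlines():
        parts = line.split()
        # Expect data rows like "0  3252070918  94  0  0  0"
        # (GPU idx, vGPU id, sm%, mem%, enc%, dec%); skip header rows.
        if len(parts) == 6 and parts[1].isdigit() and parts[2].isdigit():
            utils[parts[1]] = int(parts[2])
    return utils


# Average a few one-second samples to smooth out instantaneous readings.
history = defaultdict(list)
for _ in range(10):
    for vgpu_id, sm in sample_vgpu_sm_util().items():
        history[vgpu_id].append(sm)
    time.sleep(1)

for vgpu_id, samples in history.items():
    print(f"vGPU {vgpu_id}: avg sm util {sum(samples) / len(samples):.1f}%")
```

Even after averaging over a window like this, the split stays the same: one vGPU near 94-96% and the other at 0%.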


  1. Why does the utilization reported by the OS differ from what nvidia-smi reports when there are multiple vGPUs running on the same card?
  2. Is there a way, using nvidia-smi (or NVML, e.g. something along the lines of the sketch below), to gather metrics that closely match what the OS reports? In this case the metrics reported by the OS appear to be more accurate.
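
To make question 2 concrete, below is a minimal sketch of sampling per-vGPU utilization through NVML instead of the CLI. It assumes the nvidia-ml-py (pynvml) package on the host exposes nvmlDeviceGetVgpuUtilization for this driver, and the sample-field unpacking (uiVal) is my guess from the NVML headers, so it may need adjusting.

```python
import time
import pynvml  # nvidia-ml-py

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # the A16 in my setup

last_seen = 0
for _ in range(10):
    try:
        # One utilization sample per vGPU instance, newer than last_seen.
        samples = pynvml.nvmlDeviceGetVgpuUtilization(handle, last_seen)
    except pynvml.NVMLError:
        samples = []  # e.g. no new samples yet, or unsupported on this driver
    for s in samples:
        # smUtil/memUtil are nvmlValue_t unions; uiVal assumes unsigned-int samples.
        print(f"vGPU instance {s.vgpuInstance}: "
              f"sm {s.smUtil.uiVal}% mem {s.memUtil.uiVal}%")
        last_seen = max(last_seen, s.timeStamp)
    time.sleep(1)

pynvml.nvmlShutdown()
```

I assume nvidia-smi reads the same counters through NVML, so I would expect the same skewed numbers, but this at least keeps the per-vGPU-instance IDs explicit.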

Hello Prasad,

Thank you for opening a thread on the NVIDIA developer forum.

If you have an active support contract, we invite you to open a case with the NVIDIA Enterprise Support team, who will be best able to assist you with this issue. A case can be opened through the support portal here:

Thank you,