ESXi 6.7: vGPU VM cannot start if large memory VM with vGPU already started

I have an ESXi 6.7 host with 2 x V100 GPUs and 384 GB of RAM. I have two VMs, each using the vGPU profile grid_v100d-8c. Machine A has 16 GB RAM allocated and machine B has 256 GB RAM.

If A is started first, B can then be started and both operate fine. However, if B is started first, A cannot start and gives the error:

The amount of graphics resource available in the parent resource pool is insufficient for the operation.

It does not make sense as there is plenty of graphics resource available and adequate RAM.

While this might not seem like a problem, since both machines can be made to boot by starting them in the right order, I cannot then add any further vGPU machines without stopping the larger one. So I am only able to use 16 GB of the 64 GB of GPU memory.
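
To put numbers on that, here is a rough sketch of the arithmetic. It assumes the 64 GB is split evenly across the two cards and that the number of vGPUs a card can host is simply its framebuffer divided by the profile size (8 GB for grid_v100d-8c). The function and variable names are only for illustration.

```python
# Figures from the post above; the even split across cards is an assumption.
GPU_COUNT = 2
TOTAL_FRAMEBUFFER_GB = 64
PROFILE_FRAMEBUFFER_GB = 8   # grid_v100d-8c
RUNNING_VGPU_VMS = 2         # machines A and B


def vgpu_capacity(gpu_count: int, total_fb_gb: int, profile_fb_gb: int) -> int:
    """Return how many vGPUs of one profile size fit across all cards."""
    per_gpu_gb = total_fb_gb // gpu_count          # framebuffer per physical GPU
    slots_per_gpu = per_gpu_gb // profile_fb_gb    # vGPUs that fit on one card
    return slots_per_gpu * gpu_count


total_slots = vgpu_capacity(GPU_COUNT, TOTAL_FRAMEBUFFER_GB, PROFILE_FRAMEBUFFER_GB)
used_gb = RUNNING_VGPU_VMS * PROFILE_FRAMEBUFFER_GB
free_slots = total_slots - RUNNING_VGPU_VMS

print(f"vGPU slots: {total_slots}, in use: {RUNNING_VGPU_VMS}, free: {free_slots}")
print(f"GPU memory in use: {used_gb} of {TOTAL_FRAMEBUFFER_GB} GB")
```

Under those assumptions the host should have room for several more 8 GB vGPU machines, which is why the "insufficient graphics resource" error is unexpected.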

We’re currently running 460.73.02. Any advice would be greatly appreciated.

Hi Ben,

Please see this article: "vGPU VMs might fail to boot on ESXi 6.5 and 6.7 with multiple GPUs even if the graphics type is Shared Direct" (nvidia.com). Please upgrade to ESXi 6.7 Update 3 to resolve the issue.

If this doesn’t resolve your problem, let me know.
D

Hi Doug,

Many thanks for your response. We are running ESXi 6.7 P05 (so more recent than U3). I have checked the article but the memory is being reported correctly on our system:

Hi Ben,

Can you open a support ticket? Our customer support team will be able to assist you further.

thanks
D