I have 3 Dell PowerEdge R740 servers with 2x Tesla M10 in each server. All hosts are running ESXi 6.7 U2 with the "NVIDIA-VMware_ESXi_6.7_Host_Driver" VIB, version 430.27-1OEM.6184.108.40.20669922.
Graphics type for all hosts is set to "Shared Direct".
In general, the GPUs and vGPUs are recognized by the guest operating systems (Windows Server 2016/2019) and can be used.
But there appear to be two problems that have the same effect.
With 3 hosts and 2x M10 in each host (each M10 board has 4 physical GPUs with 8 GB of framebuffer each), I should be able to deploy 24 VMs with the "M10-8A" profile, right? Or 48 VMs with the "M10-4A" profile, for example.
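My expectation is just this arithmetic, sketched below. The board layout (4 GPUs per M10, 8 GB each) and the rule that an A-profile packs framebuffer/profile-size vGPUs onto each physical GPU are my assumptions from the published M10 specs; the numbers are not taken from my cluster.

```python
# Hypothetical capacity math, assuming the published Tesla M10 layout:
# 4 physical GPUs per board, 8 GB framebuffer per physical GPU.
HOSTS = 3
BOARDS_PER_HOST = 2
GPUS_PER_BOARD = 4          # Tesla M10 is a quad-GPU board
FB_PER_GPU_GB = 8           # 8 GB framebuffer per physical GPU

# Framebuffer per vGPU for the two profiles in question (GB).
profiles = {"M10-8A": 8, "M10-4A": 4}

total_gpus = HOSTS * BOARDS_PER_HOST * GPUS_PER_BOARD
for name, fb in profiles.items():
    vgpus_per_gpu = FB_PER_GPU_GB // fb
    print(f"{name}: {total_gpus * vgpus_per_gpu} vGPUs cluster-wide")
```

That gives 24 vGPUs cluster-wide for M10-8A and 48 for M10-4A.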
But at the moment the cluster only has the following vGPU assignments:
And I cannot start another VM with an "M10-4A" vGPU profile…
If I look at the GPUs and which VMs are running on them, I see that all GPUs except one report 0 bytes of memory, and all VMs on this host are running on the same physical GPU.
The other two hosts show all of their memory as available (8x 8 GB), but only 2 VMs are running on these hosts, each VM on its own vGPU.
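For context, here is a minimal sketch of the placement rule as I understand it: a physical GPU can only host vGPUs of a single profile, and the vGPUs must fit in its framebuffer. The host state in the example is hypothetical (VMs piled onto one GPU, as in my screenshot), not real output from my hosts.

```python
# Minimal sketch (hypothetical data) of vGPU placement on one M10 board:
# a physical GPU accepts only one vGPU profile at a time, and the
# combined framebuffer of its vGPUs cannot exceed the GPU's 8 GB.
FB_GB = 8  # framebuffer per physical M10 GPU

def can_place(gpu_vgpus, profile_gb):
    """gpu_vgpus: list of profile sizes (GB) already on this GPU."""
    if gpu_vgpus and gpu_vgpus[0] != profile_gb:
        return False  # mixed profiles on one physical GPU are not allowed
    return sum(gpu_vgpus) + profile_gb <= FB_GB

# Hypothetical host state: 4 physical GPUs, both VMs piled onto GPU 0.
host = [[4, 4], [], [], []]               # GPU 0 holds two M10-4A vGPUs
print([can_place(g, 4) for g in host])    # [False, True, True, True]
```

By this rule, another M10-4A VM should still start on any of the three idle GPUs, which is why the "0 bytes" reading on those GPUs confuses me.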
I’m completely confused as to why I can’t start more virtual machines with vGPUs in this cluster.
Maybe someone has an idea.
Thank you all.