I updated our ESXi hosts to NVIDIA-VMware-460.32.04-1OEM.670.0.0.8169922.x86_64_vib and our Windows 10 images to 461.33_grid_win10_server2016_server_2019_64bit_international.exe, and now I can no longer provision any instant clone desktops. It’s the usual error: UNKNOWN_FAULT_FATAL - No GPU capable host available for provisioning with grid profile grid_m10-1b.
The VMs themselves won’t initialise the display driver and are falling back to the default Windows SVGA drivers. The hosts themselves won’t even run the nvidia-smi command (Failed to initialize NVML: Unknown Error).
Both drivers are from the latest release (NVIDIA-GRID-vSphere-6.7-460.32.04-460.32.03-461.33.zip) and are therefore at parity (one would assume!).
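For anyone wanting to double-check the parity assumption, here is a minimal sketch that compares the file names quoted above. It assumes the bundle name lists the host (vGPU Manager) version first, then the Linux guest driver, then the Windows guest driver; that ordering is inferred from the file names in this post. The function names `bundle_versions` and `check_parity` are made up for illustration.

```python
import re

# Assumption: bundle name = NVIDIA-GRID-vSphere-<esxi>-<host>-<linux>-<windows>.zip
BUNDLE_RE = re.compile(
    r"NVIDIA-GRID-vSphere-(?P<esxi>\d+\.\d+)-"
    r"(?P<host>\d+(?:\.\d+)+)-"
    r"(?P<linux>\d+(?:\.\d+)+)-"
    r"(?P<windows>\d+(?:\.\d+)+)\.zip$"
)


def bundle_versions(bundle_name):
    """Extract the version strings encoded in the bundle file name."""
    match = BUNDLE_RE.match(bundle_name)
    if match is None:
        raise ValueError(f"unrecognised bundle name: {bundle_name!r}")
    return match.groupdict()


def check_parity(bundle_name, vib_name, windows_installer_name):
    """Check that the VIB and the Windows installer carry the bundle's versions."""
    versions = bundle_versions(bundle_name)
    return {
        # Host VIB names in this thread start with NVIDIA-VMware-<host version>-
        "host_vib_matches": vib_name.startswith(
            f"NVIDIA-VMware-{versions['host']}-"
        ),
        # Windows installer names in this thread start with <windows version>_grid_
        "windows_driver_matches": windows_installer_name.startswith(
            f"{versions['windows']}_grid_"
        ),
    }


if __name__ == "__main__":
    print(check_parity(
        "NVIDIA-GRID-vSphere-6.7-460.32.04-460.32.03-461.33.zip",
        "NVIDIA-VMware-460.32.04-1OEM.670.0.0.8169922.x86_64_vib",
        "461.33_grid_win10_server2016_server_2019_64bit_international.exe",
    ))
```

With the three names from this post, both checks come back True. That only shows the files belong to the same release; it says nothing about whether the host driver actually initialises.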
Any ideas what I can do? We’re running Dell R740 servers with 4 x Tesla M10 cards in each.
I now only have a single host left, with all our VDI desktops running on it!
Can anyone else out there who is using 12.0 or 12.1 drivers in vSphere confirm they are working for them? I have tried for 3 days now to get this working, without any success. I have removed the VIBs, rebooted, and re-added them, both via the zip package with esxcli software vib install -d and as just the VIB itself with esxcli software vib install -v. dmesg shows no issues whatsoever, as if the driver has loaded, and vmkload_mod -l | grep nvidia confirms the driver is loaded…
I have confirmed that SR-IOV is enabled in the BIOS, along with the IOMMU settings. Previous drivers work fine, just not 12.0 or 12.1, which is a pain as we need Windows 10 20H2 support.
It sounds like the issue is on the VM side, not the hypervisor side. Have you tried building a clean VM, installing the new vGPU driver and the Horizon agent with Direct Connect (saves you creating a pool on the Connection Server), and seeing if that works?
What vGPU profile are you using on your VMs?
I’m running 12.0 and 12.1 in multiple environments with no issues, but those are on 7.0 U1C and 7.0 U2, so it’s not a fair comparison.
Thanks for your input. Unfortunately, as I described, as soon as the VIBs are installed on the ESXi hosts, the ‘Graphics’ section under the host configuration in vSphere shows the GPUs with 0.00B memory and the nvidia-smi command reports (Failed to initialize NVML: Unknown Error). The issue is therefore likely with the combination of the drivers themselves and ESXi 6.7.0 (16713306).