HELP! GPU passthrough not working after HOST reboot

Hi

I have an Intel NUC 9 with a Quadro P2200. ESXi 7.0.1 is installed on the NUC.

I have enabled passthrough on the host and rebooted the NUC.

Passed both functions of the GPU (video and audio) through to the Ubuntu 20.04 VM.

Added both of these entries to the VM's advanced settings:

SVGA.Present FALSE

hypervisor.cpuid.v0 FALSE
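
For reference, those two entries should end up in the VM's .vmx file looking roughly like this:

svga.present = "FALSE"
hypervisor.cpuid.v0 = "FALSE"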

If I build the VM fresh and install the drivers via ubuntu-drivers, it works fine: nvidia-smi shows the device and its details, and hardware transcoding works.
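
In case it helps, this is roughly the sequence I mean (the exact driver package depends on what ubuntu-drivers recommends for the P2200):

# list the detected hardware and the recommended driver
ubuntu-drivers devices
# install the recommended NVIDIA driver
sudo ubuntu-drivers autoinstall
# confirm the GPU is visible to the driver
nvidia-smi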

I can then reboot or shut down this VM and it still works fine. The issue begins when I reboot the host.

If the host is rebooted, the VM still boots up, but hardware transcoding stops and nvidia-smi returns “No devices were found”.

I can, however, still see the device if I run lspci | grep VGA:

lspci |grep VGA
13:00.0 VGA compatible controller: NVIDIA Corporation GP106GL [Quadro P2200] (rev a1)
~$ lspci -k |grep -EA3 'VGA'
13:00.0 VGA compatible controller: NVIDIA Corporation GP106GL [Quadro P2200] (rev a1)
        DeviceName: pciPassthru0
        Subsystem: NVIDIA Corporation GP106GL [Quadro P2200]
        Kernel driver in use: nvidia

~$ nvidia-smi
No devices were found

My NUC does have the onboard graphics set as the default, so I can see it boot into ESXi if a cable is plugged into the onboard HDMI port. Not sure if this is an issue; if it were, it wouldn't have worked in the first instance.

So I'm pretty sure rebooting the host (NUC) is what is causing this issue.

The GPU still shows as assigned to the VM if I go to the VM and click Edit Settings. It still shows as active under the Hardware tab as well.

I can't seem to browse to /dev/dri on the Ubuntu VM either after a host reboot.

I can browse to that directory via WinSCP, though, and can see card0 and renderD128 within /dev/dri.

Not sure if this has something to do with it?
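
For reference, this is roughly what I'd expect to see from the VM's own shell when it's working (the exact groups and minor numbers may vary):

ls -l /dev/dri
# expected output along the lines of:
# crw-rw---- 1 root video  226,   0 ... card0
# crw-rw---- 1 root render 226, 128 ... renderD128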

Hello,

Did you ever resolve this? I have the exact same issue.

Thank you

For some reason, if you don't disable the card before you shut down, it ends up in this state. It's likely a bug in the driver. You can create a group policy that runs a startup and shutdown script to enable and disable the card. That works for me. The only issue is that when I do Windows updates, the script won't run on those reboots.
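
I'm doing this on a Windows guest, but the same idea should translate to the OP's Ubuntu VM. A rough sketch, wired into whatever startup/shutdown hook you prefer (e.g. a systemd unit), would be something like the following; the 0000:13:00.0 address comes from the lspci output above, and I haven't tested this on Linux myself:

# at shutdown: unbind the nvidia driver and drop the device from the PCI bus
echo 0000:13:00.0 | sudo tee /sys/bus/pci/drivers/nvidia/unbind
echo 1 | sudo tee /sys/bus/pci/devices/0000:13:00.0/remove
# at startup: rescan the bus so the device is re-discovered and the driver rebinds
echo 1 | sudo tee /sys/bus/pci/rescan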