I have created a vGPU with UUID def87179-9c53-42d7-b224-a5d281037b84. The license server is running, and I’ve provided GRID-Virtual App and QUADRO-DWS resources to the MAC address of the VM.
I get the following output when I try to start my VM:
[root@instance-1 ~]# dmesg
[nvidia-vgpu-vfio] def87179-9c53-42d7-b224-a5d281037b84: start failed. status: 0x65 Timeout Occured
[root@instance-1 ~]# virsh start win10_1
error: Failed to start domain win10_1
error: internal error: process exited while connecting to monitor: Verify all devices in group 0 are bound to vfio-pci or pci-stub and not already in use
2019-02-13T15:11:50.129364Z qemu-kvm: -device vfio-pci,id=hostdev0,sysfsdev=/sys/bus/mdev/devices/def87179-9c53-42d7-b224-a5d281037b84,display=off,bus=pci.0,addr=0x8: vfio: failed to get device def87179-9c53-42d7-b224-a5d281037b84
2019-02-13T15:11:50.129455Z qemu-kvm: -device vfio-pci,id=hostdev0,sysfsdev=/sys/bus/mdev/devices/def87179-9c53-42d7-b224-a5d281037b84,display=off,bus=pci.0,addr=0x8: Device initialization failed.
2019-02-13T15:11:50.129479Z qemu-kvm: -device vfio-pci,id=hostdev0,sysfsdev=/sys/bus/mdev/devices/def87179-9c53-42d7-b224-a5d281037b84,display=off,bus=pci.0,addr=0x8: Device 'vfio-pci' could not be initialized
I’ve tried GRID P100-1Q, P100-16Q, and P100-1A vGPUs with the same results. Further, while I can see the device’s UUID listed under /sys/bus/mdev/devices, I get the following output from nvidia-smi:
[root@instance-1 ~]# nvidia-smi vgpu -q
GPU 00000000:00:04.0
Active vGPUs : 0
[root@instance-1 ~]# nvidia-smi vgpu -c
GPU 00000000:00:04.0
GRID P100-1Q
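For cross-checking what sysfs says about the mediated device against the nvidia-smi output above, here is a minimal sketch in Python. It assumes the standard kernel mdev sysfs layout (an `mdev_type/name` file and an `iommu_group` symlink under the device directory). The function name `describe_mdev` and the `root` parameter are made up for illustration and are not part of any NVIDIA or libvirt tool.

```python
import os

# Standard sysfs location for mediated devices (same path as in the qemu-kvm log).
MDEV_ROOT = "/sys/bus/mdev/devices"


def describe_mdev(uuid, root=MDEV_ROOT):
    """Return basic sysfs facts about a mediated device, or None if it is absent.

    `root` is overridable so the function can be pointed at a test directory.
    """
    dev = os.path.join(root, uuid)
    if not os.path.isdir(dev):
        return None

    info = {"uuid": uuid, "type_name": None, "iommu_group": None}

    # Assumption: mdev_type/name holds the human-readable vGPU type.
    name_file = os.path.join(dev, "mdev_type", "name")
    if os.path.isfile(name_file):
        with open(name_file) as fh:
            info["type_name"] = fh.read().strip()

    # Assumption: iommu_group is a symlink whose last path component is the group number.
    group_link = os.path.join(dev, "iommu_group")
    if os.path.islink(group_link):
        info["iommu_group"] = os.path.basename(os.readlink(group_link))

    return info
```

Calling `describe_mdev("def87179-9c53-42d7-b224-a5d281037b84")` on the host would show whether the device directory exists, which vGPU type it was created with, and which IOMMU group it belongs to. The group number can then be compared with the "group 0" named in the virsh error.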
I am running qemu-kvm version 1.5.3 on RHEL 7.6 with kernel 3.10.0-957.el7.x86_64. Here’s the relevant portion of my VM’s XML file:
<hostdev mode='subsystem' type='mdev' managed='no' model='vfio-pci' display='off'>
  <source>
    <address uuid='def87179-9c53-42d7-b224-a5d281037b84'/>
  </source>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
</hostdev>
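To confirm that the UUID in the domain definition matches the mdev device that was created, the XML can be parsed rather than compared by eye. This is only a sketch: the helper name `mdev_uuids` is hypothetical, and it assumes the input is either a full domain definition (for example the output of `virsh dumpxml win10_1`) or a bare hostdev fragment like the one above.

```python
import xml.etree.ElementTree as ET


def mdev_uuids(domain_xml):
    """Return the UUIDs of all mdev-type <hostdev> entries in a libvirt XML string."""
    root = ET.fromstring(domain_xml)
    uuids = []
    # iter() also yields the root element, so a bare <hostdev> fragment works too.
    for hostdev in root.iter("hostdev"):
        if hostdev.get("type") != "mdev":
            continue
        addr = hostdev.find("source/address")
        if addr is not None and addr.get("uuid"):
            uuids.append(addr.get("uuid"))
    return uuids
```

Each returned UUID should have a matching directory under /sys/bus/mdev/devices. A mismatch there would be one possible cause of the "failed to get device" message, though it would not by itself explain the timeout reported in dmesg.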