Hello Team,
I am observing an issue while accessing an NVIDIA RTX 4000 GPU card from a Windows Server 2016 guest on a KVM hypervisor-based virtualization host.
Although the GPU card is visible under the Display Adapters section of Device Manager and the corresponding driver is installed, Windows stops the device with error code 43.
I have rechecked the host configuration, including enabling the IOMMU flag and the vfio-pci binding; a verification snippet follows below.
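For reference, a quick way to double-check on the host that the GPU and its sibling functions sit in their own IOMMU group (a standard sysfs walk; adjust the PCI address if your topology differs):
# List each IOMMU group and the devices it contains (run as root on the host).
# All four functions of the card at 0000:d8:00.* should show up here.
for g in /sys/kernel/iommu_groups/*; do
  echo "IOMMU group ${g##*/}:"
  for d in "$g"/devices/*; do
    echo -n "  "
    lspci -nns "${d##*/}"
  done
done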
Setup Details:
Dell PowerEdge server with VT-d enabled
Ubuntu 18.04 with libvirt, QEMU, and KVM installed
Windows Server 2016 as Guest VM
GPU card passed to the guest via PCI passthrough
Host Configuration:
dmesg output:
root@moving-deer:~# dmesg | grep -e DMAR -e IOMMU
[ 0.000000] ACPI: DMAR 0x000000006F6C2000 0001E0 (v01 DELL PE_SC3 00000001 DELL 00000001)
[ 0.000000] DMAR: IOMMU enabled
[ 1.478298] DMAR: Intel(R) Virtualization Technology for Directed I/O
[ 35.159689] vfio-pci 0000:d8:00.0: vgaarb: changed VGA decodes:
olddecodes=io+mem,decodes=io+mem:owns=none
[ 235.260035] vfio-pci 0000:d8:00.0: enabling device (0000 -> 0003)
[ 235.368328] vfio_ecap_init: 0000:d8:00.0 hiding ecap 0x1e@0x258
[ 235.368353] vfio_ecap_init: 0000:d8:00.0 hiding ecap 0x19@0x900
lspci output:
root@moving-deer:~# lspci -nnk | grep -i d8:00 -A 3
d8:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:1eb1] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:12a0]
Kernel driver in use: vfio-pci
Kernel modules: nvidiafb
d8:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10f8] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:12a0]
Kernel driver in use: vfio-pci
d8:00.2 USB controller [0c03]: NVIDIA Corporation Device [10de:1ad8] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:12a0]
Kernel driver in use: vfio-pci
d8:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device [10de:1ad9] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:12a0]
Kernel driver in use: vfio-pci
This is on purpose; you'll have to hide the hypervisor.
Yes, that has been done already:
<kvm>
<hidden state='on'/>
</kvm>
generix (February 8, 2021, 10:02am):
You might also need
<ioapic driver='kvm'/>
and
<hyperv>
...
<vendor_id state='on' value='someid'/>
...
</hyperv>
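For completeness, a minimal sketch of how those pieces fit together inside the <features> section of the domain XML; the vendor_id value is an arbitrary placeholder (any non-empty string up to 12 characters), and the relaxed/vapic/spinlocks entries are the usual Hyper-V enlightenments rather than anything specific to this setup:
<!-- Sketch of the relevant <features> section; 'whatever' is a placeholder id. -->
<features>
  <acpi/>
  <apic/>
  <hyperv>
    <relaxed state='on'/>
    <vapic state='on'/>
    <spinlocks state='on' retries='8191'/>
    <vendor_id state='on' value='whatever'/>
  </hyperv>
  <kvm>
    <hidden state='on'/>
  </kvm>
  <ioapic driver='kvm'/>
</features>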
The vendor ID part is there already.
Only the ioapic entry was missing. I added it and checked; still the same issue.
<hostdev mode='subsystem' type='pci' managed='yes'>
<source>
<address domain='0x0000' bus='0xd8' slot='0x00' function='0x0'/>
</source>
<rom file='/usr/share/kvm/vbios.bin'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0' multifunction='on'/>
</hostdev>
I also tried the rom file option in the hostdev section, as shown above.
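If that vbios.bin was not dumped from this exact card, that alone can cause trouble. For reference, a common way to dump it from the host via sysfs (the card must not be actively claimed for the read to work reliably, so it may need to be temporarily unbound from vfio-pci first):
# Dump the card's vBIOS from sysfs (run as root on the host).
cd /sys/bus/pci/devices/0000:d8:00.0
echo 1 > rom                         # enable reading the ROM
cat rom > /usr/share/kvm/vbios.bin   # copy it out
echo 0 > rom                         # disable it again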
Did you add any extra kvm config during setup?
I added a few options in the grub file:
root~# cat /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on kvm.ignore_msrs=1 vfio-pci.ids=10de:1eb1,10de:10f8,10de:1ad8,10de:1ad9"
root~# cat /etc/modprobe.d/vfio.conf
options vfio-pci ids=10de:1eb1,10de:10f8,10de:1ad8,10de:1ad9 disable_vga=1
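For reference, on Ubuntu those two files only take effect after regenerating the boot config and initramfs and rebooting; a quick way to apply and verify, using standard Ubuntu tooling:
# Regenerate the grub config and initramfs, then reboot so the
# kernel parameters and vfio-pci options take effect.
update-grub
update-initramfs -u
reboot
# After reboot, confirm vfio-pci owns all four functions:
lspci -nnk -s d8:00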
Also, the kvm-spice emulator is used for the guest VM domain:
/usr/bin/kvm-spice
Any suggestions or configurations to try out?
I also missed mentioning the PowerEdge server model: it is a Dell PowerEdge R740.
generix (February 12, 2021, 1:16pm):
The only thing that comes to mind is to make sure you also passed through the subdevices, not just the main GPU function.
Thanks for your quick response.
Actually, I tried passing through all the PCI functions that come under the main GPU, along the lines of the sketch below.
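For reference, a sketch of what that looks like in the domain XML, with all four host functions mapped onto one multifunction guest slot (the guest-side bus/slot values here are illustrative):
<!-- All four functions of the card on one multifunction guest slot. -->
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0xd8' slot='0x00' function='0x0'/>
  </source>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0' multifunction='on'/>
</hostdev>
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0xd8' slot='0x00' function='0x1'/>
  </source>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x1'/>
</hostdev>
<!-- ...and likewise for function 0x2 (USB) and function 0x3 (serial bus). -->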
As an alternative, I tried accessing the RTX 4000 GPU directly on bare metal (Dell PowerEdge R740) by installing the same Windows Server 2016 OS.
The observation is the same on both the virtualized and bare-metal configurations.
I have also confirmed that the PCIe 8-pin connector is powering the GPU.
generix (February 12, 2021, 1:25pm):
So the device also doesn’t work with win 2016 bare-metal? Seems broken, then.
Yes, the issue is the same on both configurations, tested on two different Dell PowerEdge R740 servers:
One server is installed with Ubuntu 18.04 virtualization plus a Windows Server 2016 guest VM to access the GPU.
The other server is installed with Windows Server 2016 directly to access the GPU.
generix (February 12, 2021, 1:33pm):
I guess you’ll need to have it replaced by your vendor if still under warranty.
If you want detailed info, install the driver in the Ubuntu host OS, run nvidia-bug-report.sh as root, and attach the resulting nvidia-bug-report.log.gz to your post.
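For reference, roughly what that entails on the Ubuntu host; note that the card would first need to be released from vfio-pci so the NVIDIA driver can actually attach (that unbinding step is an assumption about this particular setup):
# Run as root on the Ubuntu host, after installing the NVIDIA driver
# and releasing the card from vfio-pci (e.g. remove the vfio-pci.ids
# entries from grub/modprobe and reboot so the driver can bind).
nvidia-bug-report.sh
# This writes nvidia-bug-report.log.gz to the current directory.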