RTX 3060 PCI passthrough to guest under KVM (QEMU)

I’m trying to pass an RTX 3060 through to a virtual machine under KVM (QEMU).
The host is Ubuntu 20.04; the VM is Windows 10.
The host kernel command line includes:
intel_iommu=on iommu=pt vfio-pci.ids=10de:2487,10de:228b vfio-pci.disable_idle_d3=1
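
Before debugging further it helps to confirm that these parameters actually reached the running kernel. On Ubuntu they are normally set in GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, followed by update-grub and a reboot. The sketch below is only an illustration; the function names has_kernel_param and check_passthrough_params are made up for this example.

```shell
# Print "yes" if the given kernel parameter appears as a whole word in a
# command-line string, otherwise "no".
has_kernel_param() {
    param=$1
    cmdline=$2
    case " $cmdline " in
        *" $param "*) echo yes ;;
        *) echo no ;;
    esac
}

# Check each parameter quoted in this post against a command line
# (defaults to the running kernel's /proc/cmdline).
check_passthrough_params() {
    cmdline=${1:-$(cat /proc/cmdline)}
    for p in intel_iommu=on iommu=pt \
             vfio-pci.ids=10de:2487,10de:228b \
             vfio-pci.disable_idle_d3=1; do
        echo "$p: $(has_kernel_param "$p" "$cmdline")"
    done
}
```

Running check_passthrough_params with no argument on the host prints one yes/no line per parameter.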

The relevant part of the QEMU command line is:
-device vfio-pci,host=0000:18:00.0,id=hostdev0,bus=pci.0,addr=0x9
-device vfio-pci,host=0000:18:00.1,id=hostdev1,bus=pci.0,addr=0xa

At first it worked very well, and the VM could use the RTX 3060 normally.
On the host, the RTX 3060 showed up as follows:

# lspci | grep NVIDIA
18:00.0 VGA compatible controller: NVIDIA Corporation Device 2487 (rev a1)
18:00.1 Audio device: NVIDIA Corporation Device 228b (rev a1)

# lspci -s 18:00.0 -vv
18:00.0 VGA compatible controller: NVIDIA Corporation Device 2487 (rev a1) (prog-if 00 [VGA controller])
		Subsystem: NVIDIA Corporation Device 1530
		Physical Slot: 6
		Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
		Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
		Latency: 0, Cache Line Size: 32 bytes
		Interrupt: pin A routed to IRQ 11
……
……

# lspci -t
……
+-[0000:17]-+-00.0
|           +-00.1
|           +-00.2
|           +-00.4
|           \-02.0-[18]--+-00.0
|                        \-00.1
……

After restarting the VM many times, the RTX 3060 was lost.
At that point, the host showed:

# lspci | grep NVIDIA
18:00.0 VGA compatible controller: NVIDIA Corporation Device 2487 (rev ff)
18:00.1 Audio device: NVIDIA Corporation Device 228b (rev ff)

# lspci -s 18:00.0 -vv
18:00.0 VGA compatible controller: NVIDIA Corporation Device 2487 (rev ff) (prog-if ff)
        !!! Unknown header type 7f
        Kernel driver in use: vfio-pci
        Kernel modules: nvidiafb, nouveau
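
A revision of ff together with "Unknown header type 7f" means config-space reads are coming back as all ones, which is what a device that no longer responds on the bus looks like. The vendor and device IDs still look valid, most likely because lspci takes those from values the kernel cached at enumeration. A minimal sketch for spotting such devices from a script follows; find_dead_devices is a hypothetical helper name.

```shell
# Read `lspci` output on stdin and print the address of every device that
# reports revision ff, i.e. whose config space reads back as all ones.
find_dead_devices() {
    awk '/\(rev ff\)/ { print $1 }'
}
```

Typical use on the host would be `lspci | find_dead_devices`, for example from a cron job that raises an alert when the list is not empty.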

# dmesg | grep vfio
[ 1523.197552] vfio-pci 0000:51:00.0: vfio_ecap_init: hiding ecap 0x1e@0x258
[ 1523.197564] vfio-pci 0000:51:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
[ 1523.197568] vfio-pci 0000:51:00.0: vfio_ecap_init: hiding ecap 0x26@0xc1c
[ 1523.197569] vfio-pci 0000:51:00.0: vfio_ecap_init: hiding ecap 0x27@0xd00
[ 1523.197570] vfio-pci 0000:51:00.0: vfio_ecap_init: hiding ecap 0x25@0xe00
[ 1523.198897] vfio-pci 0000:51:00.1: vfio_ecap_init: hiding ecap 0x25@0x160
[ 1524.421723] vfio-pci 0000:51:00.1: vfio_bar_restore: reset recovery - restoring BARs
[ 1524.421771] vfio-pci 0000:51:00.0: vfio_bar_restore: reset recovery - restoring BARs
[ 1525.165440] vfio-pci 0000:51:00.0: timed out waiting for pending transaction; performing function level reset anyway
[ 1526.413440] vfio-pci 0000:51:00.0: not ready 1023ms after FLR; waiting
[ 1527.469443] vfio-pci 0000:51:00.0: not ready 2047ms after FLR; waiting
[ 1529.581440] vfio-pci 0000:51:00.0: not ready 4095ms after FLR; waiting
[ 1533.933439] vfio-pci 0000:51:00.0: not ready 8191ms after FLR; waiting
[ 1542.381440] vfio-pci 0000:51:00.0: not ready 16383ms after FLR; waiting
[ 1559.789426] vfio-pci 0000:51:00.0: not ready 32767ms after FLR; waiting
[ 1567.229554] vfio-pci 0000:18:00.0: vfio_ecap_init: hiding ecap 0x1e@0x258
[ 1567.229567] vfio-pci 0000:18:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
[ 1567.229572] vfio-pci 0000:18:00.0: vfio_ecap_init: hiding ecap 0x26@0xc1c
[ 1567.229573] vfio-pci 0000:18:00.0: vfio_ecap_init: hiding ecap 0x27@0xd00
[ 1567.229574] vfio-pci 0000:18:00.0: vfio_ecap_init: hiding ecap 0x25@0xe00
[ 1567.231029] vfio-pci 0000:18:00.1: vfio_ecap_init: hiding ecap 0x25@0x160
[ 1594.605404] vfio-pci 0000:51:00.0: not ready 65535ms after FLR; giving up
[ 1595.006376] vfio-pci 0000:51:00.0: vfio_bar_restore: reset recovery - restoring BARs
[ 1595.014162] vfio-pci 0000:51:00.1: vfio_bar_restore: reset recovery - restoring BARs
[ 1821.133200] vfio-pci 0000:51:00.0: timed out waiting for pending transaction; performing function level reset anyway
[ 1822.381198] vfio-pci 0000:51:00.0: not ready 1023ms after FLR; waiting
[ 1823.437196] vfio-pci 0000:51:00.0: not ready 2047ms after FLR; waiting
[ 1825.517195] vfio-pci 0000:51:00.0: not ready 4095ms after FLR; waiting
[ 1829.869194] vfio-pci 0000:51:00.0: not ready 8191ms after FLR; waiting
[ 1838.317188] vfio-pci 0000:51:00.0: not ready 16383ms after FLR; waiting
[ 1856.749168] vfio-pci 0000:51:00.0: not ready 32767ms after FLR; waiting
[ 1891.565131] vfio-pci 0000:51:00.0: not ready 65535ms after FLR; giving up

I tried to remove and rescan the PCI device manually, but then the device disappeared completely.

# file /sys/devices/pci0000:17/0000:17:02.0/0000:18:00.0
/sys/devices/pci0000:17/0000:17:02.0/0000:18:00.0: directory
# echo 1 > /sys/devices/pci0000:17/0000:17:02.0/0000:18:00.0/remove
# echo 1 > /sys/devices/pci0000:17/0000:17:02.0/0000:18:00.1/remove
# echo 1 > /sys/devices/pci0000:17/0000:17:02.0/rescan
# file /sys/devices/pci0000:17/0000:17:02.0/0000:18:00.0
/sys/devices/pci0000:17/0000:17:02.0/0000:19:00.0: cannot open `/sys/devices/pci0000:17/0000:17:02.0/0000:19:00.0' (No such file or directory)
# lspci | grep NVIDIA
(display nothing)

The only way to recover is to reboot the host.
Any ideas are appreciated.

I have two 3060 cards in my host, at 0000:18:00.0 and 0000:51:00.0. The kernel log above is for 0000:51:00.0; the log for 0000:18:00.0 looks the same.

The “rev ff” in the lspci output on the host says the GPUs are turned off. Might be a power issue.

Thanks for the reply.

“Might be a power issue.”
How can I confirm it?

The problem only occurs after the VM has been powered on, powered off, or rebooted many times. As long as the VM’s state doesn’t change, everything stays normal.

IDK, it’s a difficult setup to get any info from. Only the Windows eventlog might contain something but the nvidia windows driver doesn’t have good logging.

The problem also occurs when the VM runs CentOS. Is there any way to work around it?

Then please use the CentOS VM, run nvidia-bug-report.sh as root after the issue has occurred, and attach the resulting nvidia-bug-report.log.gz file to your post.

I’ll try it. Thanks a lot.

Sorry for the late reply. Here is my log; see the attachment.
Host: Ubuntu 20.04
VM: CentOS 8
I collected the log on the host after the RTX 3060 was lost.
nvidia-bug-report.log (312.9 KB)

Hi, sir, could you give me a hand again and have a look at my log?

Besides, a soft reboot of the host returns things to normal, so I wonder whether there is a way to restore the RTX 3060 after the issue has occurred. I tried a rescan, but it doesn’t work. Now I’m looking for a way to power-reset the RTX 3060 PCIe device.
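
One thing that is sometimes tried in this situation is a secondary bus reset on the upstream bridge (0000:17:02.0 according to the lspci -t output earlier in the thread) between the remove and the rescan. This is only a sketch and has not been verified on this hardware: a bus reset is not a power cycle, so it may well fail to bring back a card that has really lost power. The helper print_bus_reset_plan is a hypothetical name, and it deliberately only prints the commands so they can be reviewed before anything is run as root.

```shell
# Print the commands for a remove / secondary-bus-reset / rescan cycle.
# Arguments: the upstream bridge address, then each function behind it.
# Nothing is executed here; the commands are only echoed.
print_bus_reset_plan() {
    bridge=$1
    shift
    for fn in "$@"; do
        echo "echo 1 > /sys/bus/pci/devices/$fn/remove"
    done
    # Bit 6 (0x40) of the bridge control register is Secondary Bus Reset.
    # setpci's value:mask form changes only that bit.
    echo "setpci -s $bridge BRIDGE_CONTROL=0040:0040"
    echo "sleep 1"
    echo "setpci -s $bridge BRIDGE_CONTROL=0000:0040"
    echo "sleep 1"
    echo "echo 1 > /sys/bus/pci/devices/$bridge/rescan"
}
```

For the card in this thread the call would be `print_bus_reset_plan 0000:17:02.0 0000:18:00.0 0000:18:00.1`, with the VM shut down first.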

That looks like a log from the host; please attach one from the CentOS VM.

nvidia-bug-report.log (89.2 KB)
The attachment is the CentOS VM log. Thanks a lot.

Nothing useful in the logs. As I said, a setup like this is not really debuggable.
To rule out power issues, does it reliably work if you physically remove one of the gpus?

I don’t understand. I have two 3060 cards, and only one of them is lost at the moment. Which one should be removed? Should I just remove it, or hot-unplug it?
I guess you want me to re-plug the bad one and see whether it returns to normal. Is that right?

Hi, sir, can I try to turn the GPU card on and off using bbswitch? I ran into a problem when loading bbswitch:

[ 3737.179360] bbswitch: version 0.8
[ 3737.179374] bbswitch: Found discrete VGA device 0000:06:00.0: \_SB_.PC00.RP06.VB00.D031
[ 3737.179376] bbswitch: Found discrete VGA device 0000:18:00.0: \_SB_.PC01.BR1A.H000
[ 3737.179379] bbswitch: Found discrete VGA device 0000:51:00.0: \_SB_.PC02.BR2C.H000
[ 3737.179414] bbswitch: failed to evaluate \_SB_.PC02.BR2C.H000._DSM {0xF8,0xD8,0x86,0xA4,0xDA,0x0B,0x1B,0x47,0xA7,0x2B,0x60,0x42,0xA6,0xB5,0xBE,0xE0}0x100 0x0 {0x00,0x00,0x00,0x00}: AE_NOT_FOUND
[ 3737.179417] bbswitch: failed to evaluate \_SB_.PC02.BR2C.H000._DSM {0xA0,0xA0,0x95,0x9D,0x60,0x00,0x48,0x4D,0xB3,0x4D,0x7E,0x5F,0xEA,0x12,0x9F,0xD4}0x102 0x0 {0x00,0x00,0x00,0x00}: AE_NOT_FOUND
[ 3737.179417] bbswitch: No suitable _DSM call found.

bbswitch is for notebooks only.

I’m using Ubuntu 22.04 on the host and can check the card’s status via /sys/bus/pci/devices/0000:18:00.0/power_state.
I disabled D3hot by setting vfio_pci.disable_idle_d3=1 in GRUB, so normally the GPU card always stays in D0.

root@POD209-CLU01-H012:/sys/bus/pci/devices/0000:18:00.0# cat power_state
D0
root@POD209-CLU01-H012:/sys/bus/pci/devices/0000:18:00.0#
root@POD209-CLU01-H012:/sys/bus/pci/devices/0000:18:00.0# setpci -s 18:00.0 60.l     // 60 is Power Management Capabilities Register
48036801
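
For reference, the 32-bit value read at offset 0x60 packs three fields: the capability ID in the low byte (0x01 is Power Management), the pointer to the next capability, and the 16-bit PMC register in the upper half. The current D-state lives in the low two bits of PMCSR, which sits four bytes further on (0x64 here). A small decoding sketch follows, with decode_pm_header and pmcsr_state as made-up helper names; it assumes the hex value is passed without a 0x prefix, as setpci prints it.

```shell
# Split the dword read at the PM capability offset into its three fields.
decode_pm_header() {
    val=$((0x$1))
    printf 'cap_id=0x%02x next=0x%02x pmc=0x%04x\n' \
        $((val & 0xff)) $(((val >> 8) & 0xff)) $(((val >> 16) & 0xffff))
}

# Map the low two bits of a PMCSR value to the D-state they encode.
pmcsr_state() {
    case $((0x$1 & 3)) in
        0) echo D0 ;;
        1) echo D1 ;;
        2) echo D2 ;;
        3) echo D3hot ;;
    esac
}
```

Usage would look like `decode_pm_header "$(setpci -s 18:00.0 60.l)"` and `pmcsr_state "$(setpci -s 18:00.0 64.w)"`. Once the device has dropped off the bus every read returns all ones, so the decoded fields are meaningless in that state.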

When the problem occurred, power_state changed to D3cold.

root@POD209-CLU01-H045:/sys/bus/pci/devices/0000:18:00.0# cat power_state
D3cold
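
To narrow down when the transition happens relative to VM start and stop, the attribute can be polled and every change logged with a timestamp. This is a minimal sketch; read_power_state, watch_power_state, and the SYSFS_PCI override are names invented for this example. As far as I understand the kernel's PCI core, it can also report D3cold simply because the device stopped answering config reads, so the value alone may not prove that main power was cut.

```shell
# Print the power_state of one PCI function, or "missing" if the sysfs
# attribute is not there. SYSFS_PCI can be overridden for testing.
read_power_state() {
    f="${SYSFS_PCI:-/sys/bus/pci/devices}/$1/power_state"
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo missing
    fi
}

# Poll one function and print a timestamped line whenever the state changes.
# Runs until interrupted.
watch_power_state() {
    dev=$1
    interval=${2:-1}
    last=unknown
    while :; do
        cur=$(read_power_state "$dev")
        if [ "$cur" != "$last" ]; then
            echo "$(date '+%F %T') $dev $last -> $cur"
            last=$cur
        fi
        sleep "$interval"
    done
}
```

Leaving `watch_power_state 0000:18:00.0 >> /var/log/gpu-power.log &` running while cycling the VM gives timestamps that can be matched against the vfio-pci lines in dmesg; the log path is just an example.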

Does this mean main power was removed unexpectedly? Any idea why?

Absolutely no idea how that happens.