Vfio-pci reservation does not work with two nVidia GPUs and nVidia driver 525.85.05: Failed to initialize during X server startup

I want to create a KVM virtual machine with GPU passthrough. I recently upgraded my system from Ubuntu 20.04 to 22.04.1 (Linux Mint 21.1) and immediately ran into driverctl hanging during startup. I uninstalled driverctl and switched the GPU reservation to the GRUB method, i.e. passing vfio-pci.ids on the kernel command line. Ubuntu 22.04 builds vfio into the kernel rather than shipping it as modules, so this is the way to go.

I have tried several kernels to no avail: 6.1, 6.0, 5.17, and 5.15. Since none of them made a difference, I have settled on 5.15, which is the main kernel of the distribution.

The nouveau driver does work: I get an X session (lightdm). However, there is mouse lag, and I want to use the full-featured nVidia driver on the host system.

The hardware change was the removal of a broken SSD, which moved the GPUs up by one in the PCI enumeration. Hopefully no outdated enumeration is persisted anywhere. The GRUB method uses the PCI vendor/device IDs and is therefore independent of PCI enumeration, so no change was needed there.
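To double-check which driver actually holds each GPU function regardless of its current PCI address, something like the following sketch could be used. It assumes the usual Linux sysfs layout (`vendor`, `device`, and a `driver` symlink under `/sys/bus/pci/devices/<address>`); the function name `list_pci_bindings` is hypothetical and not part of any tool mentioned here.

```python
import os
from pathlib import Path


def list_pci_bindings(sysfs_root="/sys/bus/pci/devices", vendor="0x10de"):
    """Return (address, 'vendor:device', driver or None) for each PCI function of one vendor.

    Assumption: sysfs_root follows the standard sysfs layout. 0x10de is the
    PCI vendor ID used in the vfio-pci.ids list of this post (nVidia).
    """
    results = []
    root = Path(sysfs_root)
    if not root.is_dir():
        return results
    for dev in sorted(root.iterdir()):
        try:
            ven = (dev / "vendor").read_text().strip()
            did = (dev / "device").read_text().strip()
        except OSError:
            continue  # entry without readable ID files, skip it
        if ven != vendor:
            continue
        driver_link = dev / "driver"
        # The 'driver' symlink only exists while a driver is bound.
        driver = os.path.basename(os.readlink(driver_link)) if driver_link.is_symlink() else None
        results.append((dev.name, f"{ven[2:]}:{did[2:]}", driver))
    return results


if __name__ == "__main__":
    for address, ids, driver in list_pci_bindings():
        print(address, ids, driver or "(no driver bound)")
```

With the reservation working as intended, the functions of GPU-1 should report vfio-pci and GPU-0 should report nvidia (or nouveau), whatever bus addresses they currently have.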

Single monitor: LG 43UN700-B 3840x2160x60Hz (DP and HDMI3 inputs)
GPU-0: nVidia GT 1030 (DP, for host, nvidia, in other PCI slot)
GPU-1: nVidia RTX 2070 (HDMI3, for guest, vfio-pci, in GPU PCI slot and primary adapter)
Motherboard: Asus X570 ProArt Creator with BIOS 0904
CPU: AMD Ryzen 9 5950X (16 cores)
Mem: 64 GB DDR4-3200 ECC RAM (2x 32 GB)

GRUB kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.15.0-58-generic root=UUID=527bf775-0c36-4143-9409-8acbd155197f ro amd_iommu=on iommu=pt nvidia_drm.modeset=1 vfio-pci.ids=10de:1f02,10de:10f9,10de:1ada,10de:1adb quiet
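As a quick sanity check that the running kernel really received the ID list, the command line can be parsed as in this minimal sketch. The helper name `vfio_ids_from_cmdline` is made up for illustration; the string is the command line quoted above, and on a live system it could be read from `/proc/cmdline` instead.

```python
def vfio_ids_from_cmdline(cmdline):
    """Return the vendor:device entries passed via vfio-pci.ids on a kernel command line."""
    ids = []
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        # The kernel treats '-' and '_' in parameter names as equivalent,
        # so accept both vfio-pci.ids and vfio_pci.ids.
        if sep and key.replace("-", "_") == "vfio_pci.ids":
            ids.extend(part for part in value.split(",") if part)
    return ids


CMDLINE = (
    "BOOT_IMAGE=/boot/vmlinuz-5.15.0-58-generic "
    "root=UUID=527bf775-0c36-4143-9409-8acbd155197f ro amd_iommu=on iommu=pt "
    "nvidia_drm.modeset=1 vfio-pci.ids=10de:1f02,10de:10f9,10de:1ada,10de:1adb quiet"
)

print(vfio_ids_from_cmdline(CMDLINE))
# prints ['10de:1f02', '10de:10f9', '10de:1ada', '10de:1adb']
```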

X server: xorg-server 2:21.1.3-2ubuntu2.5

The X server crashed during startup with the nVidia driver 525.85.05. I had previously tested 525.60 and 525.78 with the same outcome. A fresh installation of Ubuntu 22.04.1 (not Mint) on a spare partition showed the same behavior with 525.78.

I can use the nVidia driver if I let it grab both GPUs by removing the vfio-pci parameter in the GRUB kernel command line. In that case both GPUs are used for the X session, and one is selected randomly for the chooser/login widget.

During boot, GPU-1 is used because it is the primary adapter, and I cannot change this in the BIOS.
nvidia-bug-report.log.gz (423.1 KB)

Some additional remarks: I installed Ubuntu 22.04.1 after removing the SSD, so PCI enumeration is not the culprit. On the Ubuntu system, the nVidia driver 525-open was installed at first. After changing it to driver 525, the system works. Further investigation shows that this change led gdm to start a Wayland session instead of an X session. I have tried Firefox in default mode (Xwayland), and it works. Setting Firefox to native Wayland works as well. If I disable Wayland by uncommenting WaylandEnable=false in /etc/gdm3/custom.conf, the issue recurs.
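To tell whether gdm is currently being forced to Xorg, the state of that setting can be checked with a small sketch like the one below. It assumes the key lives in the `[daemon]` section, as in the stock file; the function name `wayland_disabled` and the sample text are illustrative only, not copied from my system.

```python
import configparser


def wayland_disabled(conf_text):
    """Return True if WaylandEnable=false is active (uncommented) in the [daemon] section."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep key names case-sensitive
    parser.read_string(conf_text)
    # A commented-out line is ignored, so the fallback applies in that case.
    value = parser.get("daemon", "WaylandEnable", fallback="true")
    return value.strip().lower() == "false"


# Illustrative sample: the line is still commented out, so Wayland stays enabled.
SAMPLE = """\
[daemon]
#WaylandEnable=false
"""

print(wayland_disabled(SAMPLE))
# prints False
```

On a real system the text would come from reading /etc/gdm3/custom.conf; a result of True corresponds to the configuration in which the X server startup failure comes back.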