Unbinding/isolating a card is difficult post-470

I have a pair of NVIDIA GPUs in my desktop for PCI passthrough virtual machines using VFIO on Ubuntu 20.04, kernel 5.15.0-43. A 3090 Ti FE for the host, and a 750 Ti for the guest.

At some point after the 470 series drivers (490+, including the current 515.65.01), it has become increasingly difficult to hot-unbind the card from the nvidia kernel modules. I’ve encountered two specific bits of the NVIDIA drivers which hold on to the card when they maybe shouldn’t:

  1. nvidia-drm
    When nvidia-drm is loaded with the modeset=1 parameter, it holds on to all GPUs until it is unloaded. GPUs bound to nvidia afterwards are not affected; it appears to be the nv_drm_probe_devices function in nvidia-drm-drv.c that grabs the GPUs, and it is only called when the nvidia-drm driver loads. Attempting to unload nvidia while nvidia-drm is holding on to the card results in this:
    NVRM: Attempting to remove device 0000:05:00.0 with non-zero usage count!

  2. Something in the userspace drivers
    On older drivers (<= 470) using X11, I was able to isolate the guest GPU’s /dev/nvidiaX device from userspace applications with an X11 config containing Option "AutoAddGPU" "false" in the ServerLayout section, and the NVIDIA-specific Option "ProbeAllGpus" "false" under the Device section.
    This no longer works: the userspace components always pick up the guest GPU (Xorg, Firefox, etc. can be seen holding its /dev/nvidiaX device open) despite my X11 config only specifying the host GPU. Another shortfall: there doesn't seem to be a way to achieve the same isolation under Wayland.
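For reference, the isolation config that worked for me on <= 470 looked roughly like this (an abbreviated sketch; the Identifier names and BusID are placeholders, not my exact values):

```
Section "ServerLayout"
    Identifier "Layout0"
    Screen     0 "Screen0"
    # Don't auto-add GPUs the config doesn't mention
    Option     "AutoAddGPU" "false"
EndSection

Section "Device"
    Identifier "Device0"
    Driver     "nvidia"
    BusID      "PCI:1:0:0"    # host GPU only; adjust to your topology
    # NVIDIA-specific: don't probe GPUs other than the one above
    Option     "ProbeAllGpus" "false"
EndSection
```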

Basically, what I’m looking for is a way to hot-unbind a card as needed, and to easily isolate it from userspace programs so the unbind can succeed. I need to keep the guest card bound to the nvidia driver while my VMs are not running, both for proper power management and so I can check its status with nvidia-smi, so leaving it permanently unbound isn’t a solution. It’s also nice to be able to run headless compute or encoding on it while it’s not attached to a VM.

I don’t really need nvidia-drm modesetting on the guest card, so a module parameter that makes nvidia-drm ignore specified cards would work for me, but getting true hotplug working would be ideal.

My current workaround is to late-bind the guest card to nvidia and simply not give it a /dev/nvidiaX node, but this isn’t great.
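For anyone wanting to reproduce the workaround, a minimal sketch of the late-bind/unbind via sysfs (0000:05:00.0 is my guest card's address; substitute your own from lspci — the overridable SYSFS root is only there so the commands can be dry-run against a fake tree):

```shell
#!/bin/sh
# Late-bind the guest card to nvidia via sysfs, and unbind it again
# before handing the card to a VM. Run as root against the real /sys.
SYSFS="${SYSFS:-/sys}"
GUEST="${GUEST:-0000:05:00.0}"

bind_guest() {
    # driver_override pins the next bind of this device to nvidia,
    # regardless of which other drivers could claim it.
    echo nvidia > "$SYSFS/bus/pci/devices/$GUEST/driver_override"
    echo "$GUEST" > "$SYSFS/bus/pci/drivers/nvidia/bind"
}

unbind_guest() {
    # This is the step that fails with "non-zero usage count" when
    # nvidia-drm (modeset=1) or a userspace process holds the card.
    echo "$GUEST" > "$SYSFS/bus/pci/devices/$GUEST/driver/unbind"
}
```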

Same regression observed for me:

  • A long time ago, I could unbind my eGPU and disable it with bbswitch, for longer battery life on my laptop.
  • More recently, when attempting to unbind nvidia (even with no GUI running), I get:
    NVRM: Attempting to remove device 0000:01:00.0 with non-zero usage count!
    unless nvidia-drm was loaded with modeset=0.
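A quick way to check for that failure mode up front is to read the modeset parameter back from sysfs before attempting the unbind; a small sketch (the sysfs root is parameterized purely so it can be exercised against a fake tree):

```shell
#!/bin/sh
# Succeeds (exit 0) when nvidia-drm was loaded with modeset=1 --
# i.e. the configuration where the unbind fails with
# "non-zero usage count".
modeset_enabled() {
    sysfs="${1:-/sys}"
    [ "$(cat "$sysfs/module/nvidia_drm/parameters/modeset" 2>/dev/null)" = "Y" ]
}
```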