[Q] vGPU on Ubuntu 24.04 (GPU: A10): libvirt raises kernel NULL pointer dereference

Hello, members. Does anyone know how to fix this issue?

  1. Expected behavior
    With KVM on an Ubuntu host, the Ubuntu guest starts with virsh start and the vGPU is recognized by nvidia-smi in the guest.

  2. Actual behavior

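With managed='yes' in the libvirt hostdev configuration (see the guest config XML below), the kernel reports:
BUG: kernel NULL pointer dereference, address: 0000000000000010

With managed='no', the guest starts and nvidia-smi works correctly.

Call trace excerpt from the kernel log (the full kernel message is below):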

  [  216.812092]  vfio_device_set_group+0x19/0x40 [vfio]
  [  216.812099]  __vfio_register_dev+0x6c/0x140 [vfio]
  [  216.812105]  ? __pm_runtime_idle+0x4c/0xd0
  [  216.812114]  vfio_register_group_dev+0x10/0x20 [vfio]
  [  216.812119]  vfio_pci_core_register_device+0x1b6/0x460 [vfio_pci_core]
  [  216.812130]  vfio_pci_probe+0x53/0x140 [vfio_pci]
  [  216.812133]  local_pci_probe+0x44/0xb0
  [  216.812143]  work_for_cpu_fn+0x17/0x30
  [  216.812147]  process_one_work+0x181/0x3a0
  [  216.812150]  worker_thread+0x306/0x440
  [  216.812153]  ? __pfx_worker_thread+0x10/0x10
  [  216.812156]  kthread+0xef/0x120
  [  216.812163]  ? __pfx_kthread+0x10/0x10
  [  216.812167]  ret_from_fork+0x44/0x70
  [  216.812174]  ? __pfx_kthread+0x10/0x10
  [  216.812177]  ret_from_fork_asm+0x1b/0x30

  3. Steps to reproduce
    (1) /usr/lib/nvidia/sriov-manage -e 0000:ca:00.0
    (2) echo 588 > /sys/bus/pci/devices/0000:ca:00.4/nvidia/current_vgpu_type
    (3) virsh start vgpu1
    (4) Result: BUG: kernel NULL pointer dereference, address: 0000000000000010

  4. Environment:
    NVIDIA-GRID-Ubuntu-KVM-580.95.02-580.95.05-581.42
    Guest: nvidia-linux-grid-580_580.95.05_amd64.deb
    Host: nvidia-vgpu-ubuntu-580_580.95.02_amd64.deb
    GPU: A10
    OS: Ubuntu 24.04
    Kernel: 6.8.0-87-generic

lspci | grep NVIDIA
0000:ca:00.0 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:00.4 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:00.5 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:00.6 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:00.7 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:01.0 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:01.1 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:01.2 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:01.3 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:01.4 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:01.5 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:01.6 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:01.7 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:02.0 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:02.1 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:02.2 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:02.3 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:02.4 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:02.5 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:02.6 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:02.7 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:03.0 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:03.1 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:03.2 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:03.3 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:03.4 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:03.5 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:03.6 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:03.7 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:04.0 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:04.1 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:04.2 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
0000:ca:04.3 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)

Guest config xml file

<domain type='kvm' id='2'>
  <name>vgpu1</name>
  <uuid>c0e22088-c99f-4c20-90e8-e67ed37db700</uuid>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <vcpu placement='static'>1</vcpu>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os firmware='efi'>
    <type arch='x86_64' machine='pc-q35-8.2'>hvm</type>
    <firmware>
      <feature enabled='no' name='enrolled-keys'/>
      <feature enabled='no' name='secure-boot'/>
    </firmware>
    <loader readonly='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE_4M.fd</loader>
    <nvram template='/usr/share/OVMF/OVMF_VARS_4M.fd'>/var/lib/libvirt/qemu/nvram/vgpu1_VARS.fd</nvram>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <cpu mode='host-passthrough' check='none' migratable='on'/>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/path/to/vgpu1.img' index='1'/>
      <backingStore/>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </disk>
    <controller type='pci' index='0' model='pcie-root'>
      <alias name='pcie.0'/>
    </controller>
    <controller type='pci' index='1' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='1' port='0x8'/>
      <alias name='pci.1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </controller>
    <controller type='pci' index='2' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='2' port='0x9'/>
      <alias name='pci.2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </controller>
    <controller type='pci' index='3' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='3' port='0xa'/>
      <alias name='pci.3'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <controller type='pci' index='4' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='4' port='0x8'/>
      <alias name='pci.4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0' multifunction='on'/>
    </controller>
    <controller type='pci' index='5' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='5' port='0x9'/>
      <alias name='pci.5'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='usb' index='0' model='qemu-xhci'>
      <alias name='usb'/>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
    </controller>
    <controller type='sata' index='0'>
      <alias name='ide'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:4e:9c:6c'/>
      <source bridge='brbond0'/>
      <target dev='vnet1'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
    </interface>
    <serial type='pty'>
      <source path='/dev/pts/2'/>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
      <alias name='serial0'/>
    </serial>
    <console type='pty' tty='/dev/pts/2'>
      <source path='/dev/pts/2'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <input type='mouse' bus='ps2'>
      <alias name='input0'/>
    </input>
    <input type='keyboard' bus='ps2'>
      <alias name='input1'/>
    </input>
    <audio id='1' type='none'/>
    <video>
      <model type='none'/>
      <alias name='video0'/>
    </video>
    <!-- HERE: the vGPU hostdev. managed='yes' triggers the crash; managed='no' works. -->
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0xca' slot='0x00' function='0x4'/>
      </source>
      <alias name='hostdev0'/>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
    </hostdev>
    <watchdog model='itco' action='reset'>
      <alias name='watchdog0'/>
    </watchdog>
    <memballoon model='virtio'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
    </memballoon>
  </devices>
  <seclabel type='dynamic' model='apparmor' relabel='yes'>
    <label>libvirt-c0e22088-c99f-4c20-90e8-e67ed37db700</label>
    <imagelabel>libvirt-c0e22088-c99f-4c20-90e8-e67ed37db700</imagelabel>
  </seclabel>
  <seclabel type='dynamic' model='dac' relabel='yes'>
    <label>+64055:+994</label>
    <imagelabel>+64055:+994</imagelabel>
  </seclabel>
</domain>


Supported vGPU types

cat /sys/bus/pci/devices/0000\:ca\:00.4/nvidia/creatable_vgpu_types
ID    : vGPU Name
588   : NVIDIA A10-1B
589   : NVIDIA A10-2B
590   : NVIDIA A10-1Q
591   : NVIDIA A10-2Q
592   : NVIDIA A10-3Q
593   : NVIDIA A10-4Q
594   : NVIDIA A10-6Q
595   : NVIDIA A10-8Q
596   : NVIDIA A10-12Q
597   : NVIDIA A10-24Q
598   : NVIDIA A10-1A
599   : NVIDIA A10-2A
600   : NVIDIA A10-3A
601   : NVIDIA A10-4A
602   : NVIDIA A10-6A
603   : NVIDIA A10-8A
604   : NVIDIA A10-12A
605   : NVIDIA A10-24A
2172  : NVIDIA A10-3B


kernel message

[  216.593438] kauditd_printk_skb: 116 callbacks suppressed
[  216.593447] audit: type=1400 audit(1762346790.115:128): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libvirt-c0e22088-c99f-4c20-90e8-e67ed37db700" pid=2629 comm="apparmor_parser"
[  216.593793] audit: type=1400 audit(1762346790.115:129): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libvirt-c0e22088-c99f-4c20-90e8-e67ed37db700//passt" pid=2629 comm="apparmor_parser"
[  216.687415] audit: type=1400 audit(1762346790.209:130): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-c0e22088-c99f-4c20-90e8-e67ed37db700" pid=2632 comm="apparmor_parser"
[  216.693254] audit: type=1400 audit(1762346790.215:131): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="libvirt-c0e22088-c99f-4c20-90e8-e67ed37db700//passt" pid=2632 comm="apparmor_parser"
[  216.785302] audit: type=1400 audit(1762346790.307:132): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-c0e22088-c99f-4c20-90e8-e67ed37db700" pid=2636 comm="apparmor_parser"
[  216.791246] audit: type=1400 audit(1762346790.313:133): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="libvirt-c0e22088-c99f-4c20-90e8-e67ed37db700//passt" pid=2636 comm="apparmor_parser"
[  216.811390] nvidia 0000:ca:00.4: Enabling HDA controller
[  216.811515] nvidia 0000:ca:00.4: Enabling HDA controller
[  216.811525] nvidia 0000:ca:00.4: Runtime PM usage count underflow!
[  216.811897] ------------[ cut here ]------------
[  216.811902] WARNING: CPU: 16 PID: 777 at drivers/vfio/group.c:695 vfio_group_find_or_alloc+0xb9/0x1e0 [vfio]
[  216.811918] Modules linked in: vfio_pci pci_pf_stub xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables qrtr bridge stp llc bonding nvidia_vgpu_vfio(OE) cfg80211 binfmt_misc nvidia(POE) intel_rapl_msr intel_rapl_common nls_iso8859_1 intel_uncore_frequency intel_uncore_frequency_common i10nm_edac skx_edac_common nfit x86_pkg_temp_thermal intel_powerclamp coretemp vfio_pci_core mdev kvm_intel vfio_iommu_type1 vfio iommufd cmdlinepart spi_nor mtd kvm dax_hmem ast cxl_acpi ipmi_ssif rapl cxl_port irqbypass intel_cstate cxl_core i2c_algo_bit isst_if_mbox_pci isst_if_mmio intel_th_gth isst_if_common mei_me spi_intel_pci ioatdma intel_th_pci i2c_i801 spi_intel mei i2c_smbus intel_th intel_pch_thermal intel_vsec dca acpi_power_meter ipmi_si acpi_ipmi ipmi_devintf ipmi_msghandler acpi_pad joydev input_leds mac_hid sch_fq_codel dm_multipath msr efi_pstore nfnetlink dmi_sysfs ip_tables x_tables autofs4 btrfs
[  216.811996]  blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 mlx5_ib ib_uverbs macsec ib_core rndis_host cdc_ether usbnet mii hid_generic usbhid hid raid1 nvme nvme_core nvme_auth mlx5_core crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha256_ssse3 mlxfw sha1_ssse3 psample tls ahci vmd pci_hyperv_intf xhci_pci libahci xhci_pci_renesas aesni_intel crypto_simd cryptd
[  216.812041] CPU: 16 PID: 777 Comm: kworker/16:2 Tainted: P           OE      6.8.0-87-generic #88-Ubuntu
[  216.812044] Hardware name: Supermicro SYS-220U-TNR/X12DPU-6, BIOS 2.4 08/21/2025
[  216.812046] Workqueue: events work_for_cpu_fn
[  216.812057] RIP: 0010:vfio_group_find_or_alloc+0xb9/0x1e0 [vfio]
[  216.812063] Code: 8b 80 03 00 00 48 8d 42 d8 48 39 d1 75 0f eb 60 48 8b 50 28 48 8d 42 d8 48 39 d1 74 53 4c 3b 28 75 ee 4c 89 f7 e8 77 89 a5 c7 <0f> 0b 48 c7 c3 ea ff ff ff eb 6a 44 0f b6 25 24 29 03 00 41 80 fc
[  216.812066] RSP: 0018:ff6ee0c620027d10 EFLAGS: 00010246
[  216.812069] RAX: 0000000000000000 RBX: ff4a4e2a91809800 RCX: ff4a4e2a91809b80
[  216.812071] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[  216.812072] RBP: ff6ee0c620027d30 R08: 0000000000000000 R09: 0000000000000000
[  216.812073] R10: 0000000000000000 R11: 0000000000000000 R12: ff4a4e6a075c2900
[  216.812075] R13: ff4a4e6a36fbe0c8 R14: ff4a4e2a91809b90 R15: ff4a4e6a36fbe000
[  216.812076] FS:  0000000000000000(0000) GS:ff4a4ea8fee00000(0000) knlGS:0000000000000000
[  216.812078] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  216.812080] CR2: 00007315cbb0e090 CR3: 0000005b8023c005 CR4: 0000000000771ef0
[  216.812082] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  216.812084] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  216.812085] PKRU: 55555554
[  216.812086] Call Trace:
[  216.812089]  <TASK>
[  216.812092]  vfio_device_set_group+0x19/0x40 [vfio]
[  216.812099]  __vfio_register_dev+0x6c/0x140 [vfio]
[  216.812105]  ? __pm_runtime_idle+0x4c/0xd0
[  216.812114]  vfio_register_group_dev+0x10/0x20 [vfio]
[  216.812119]  vfio_pci_core_register_device+0x1b6/0x460 [vfio_pci_core]
[  216.812130]  vfio_pci_probe+0x53/0x140 [vfio_pci]
[  216.812133]  local_pci_probe+0x44/0xb0
[  216.812143]  work_for_cpu_fn+0x17/0x30
[  216.812147]  process_one_work+0x181/0x3a0
[  216.812150]  worker_thread+0x306/0x440
[  216.812153]  ? __pfx_worker_thread+0x10/0x10
[  216.812156]  kthread+0xef/0x120
[  216.812163]  ? __pfx_kthread+0x10/0x10
[  216.812167]  ret_from_fork+0x44/0x70
[  216.812174]  ? __pfx_kthread+0x10/0x10
[  216.812177]  ret_from_fork_asm+0x1b/0x30
[  216.812184]  </TASK>
[  216.812185] ---[ end trace 0000000000000000 ]---
[  216.812217] vfio-pci: probe of 0000:ca:00.4 failed with error -22
[  216.901145] audit: type=1400 audit(1762346790.423:134): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="libvirt-c0e22088-c99f-4c20-90e8-e67ed37db700" pid=2644 comm="apparmor_parser"
[  216.901451] audit: type=1400 audit(1762346790.423:135): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="libvirt-c0e22088-c99f-4c20-90e8-e67ed37db700//passt" pid=2644 comm="apparmor_parser"
[  216.919086] brbond0: port 2(vnet0) entered blocking state
[  216.919100] brbond0: port 2(vnet0) entered disabled state
[  216.919131] vnet0: entered allmulticast mode
[  216.919510] vnet0: entered promiscuous mode
[  216.919975] brbond0: port 2(vnet0) entered blocking state
[  216.919987] brbond0: port 2(vnet0) entered forwarding state
[  217.007756] audit: type=1400 audit(1762346790.529:136): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-c0e22088-c99f-4c20-90e8-e67ed37db700" pid=2654 comm="apparmor_parser"
[  217.013243] audit: type=1400 audit(1762346790.535:137): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="libvirt-c0e22088-c99f-4c20-90e8-e67ed37db700//passt" pid=2654 comm="apparmor_parser"
[  218.557158] BUG: kernel NULL pointer dereference, address: 0000000000000010
[  218.557182] #PF: supervisor read access in kernel mode
[  218.557190] #PF: error_code(0x0000) - not-present page
[  218.557198] PGD 124a4d067 P4D 0
[  218.557206] Oops: 0000 [#1] PREEMPT SMP NOPTI
[  218.557214] CPU: 32 PID: 2658 Comm: qemu-system-x86 Tainted: P        W  OE      6.8.0-87-generic #88-Ubuntu
[  218.557225] Hardware name: Supermicro SYS-220U-TNR/X12DPU-6, BIOS 2.4 08/21/2025
[  218.557234] RIP: 0010:vfio_df_open+0x3e/0x120 [vfio]
[  218.557248] Code: 83 ec 08 4c 8b 2f 41 8b 85 e4 03 00 00 85 c0 75 6f 41 c7 85 e4 03 00 00 01 00 00 00 4c 8b 37 4c 8b 67 28 49 8b 06 48 8b 40 68 <48> 8b 78 10 e8 f9 ec 9f c6 84 c0 0f 84 ab 00 00 00 4d 85 e4 0f 84
[  218.557268] RSP: 0018:ff6ee0c6240d7b18 EFLAGS: 00010246
[  218.557276] RAX: 0000000000000000 RBX: ff4a4e2a96ed9480 RCX: 0000000000000000
[  218.557284] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ff4a4e2a96ed9480
[  218.557292] RBP: ff6ee0c6240d7b40 R08: 0000000000000000 R09: 0000000000000000
[  218.557300] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[  218.557308] R13: ff4a4e2a97d16000 R14: ff4a4e2a97d16000 R15: 0000000096ed9480
[  218.557316] FS:  00007207c6d16f00(0000) GS:ff4a4e68ffa00000(0000) knlGS:0000000000000000
[  218.557325] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  218.557331] CR2: 0000000000000010 CR3: 0000000135b2a005 CR4: 0000000000773ef0
[  218.557339] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  218.557346] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  218.557354] PKRU: 55555554
[  218.557359] Call Trace:
[  218.557364]  <TASK>
[  218.557369]  ? vfio_device_get_kvm_safe+0x57/0xc0 [vfio]
[  218.557382]  vfio_df_group_open+0x9c/0x200 [vfio]
[  218.557392]  vfio_group_ioctl_get_device_fd+0x10f/0x250 [vfio]
[  218.557403]  vfio_group_fops_unl_ioctl+0xff/0x3b0 [vfio]
[  218.557413]  __x64_sys_ioctl+0xa0/0xf0
[  218.557424]  x64_sys_call+0x12a3/0x25a0
[  218.557432]  do_syscall_64+0x7f/0x180
[  218.557443]  ? __symbol_put+0x69/0xa0
[  218.557454]  ? __kmalloc+0x1c0/0x4f0
[  218.557466]  ? task_numa_fault+0x23d/0x3f0
[  218.557475]  ? mpol_misplaced+0x69/0x200
[  218.557483]  ? do_numa_page+0x24d/0x3c0
[  218.557492]  ? handle_pte_fault+0x16e/0x1d0
[  218.557500]  ? __handle_mm_fault+0x654/0x800
[  218.557508]  ? __count_memcg_events+0x6b/0x120
[  218.557516]  ? count_memcg_events.constprop.0+0x2a/0x50
[  218.557524]  ? handle_mm_fault+0xad/0x380
[  218.557531]  ? arch_exit_to_user_mode_prepare.isra.0+0x1a/0xe0
[  218.557540]  ? irqentry_exit_to_user_mode+0x38/0x1e0
[  218.557548]  ? irqentry_exit+0x43/0x50
[  218.557554]  ? clear_bhb_loop+0x15/0x70
[  218.557767]  ? clear_bhb_loop+0x15/0x70
[  218.557954]  ? clear_bhb_loop+0x15/0x70
[  218.558130]  entry_SYSCALL_64_after_hwframe+0x78/0x80
[  218.558304] RIP: 0033:0x7207c7324e1d
[  218.558504] Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00
[  218.558870] RSP: 002b:00007ffd8d48c140 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  218.559056] RAX: ffffffffffffffda RBX: 00005f749a48cab0 RCX: 00007207c7324e1d
[  218.559240] RDX: 00005f749a487960 RSI: 0000000000003b6a RDI: 000000000000000a
[  218.559421] RBP: 00007ffd8d48c190 R08: 0000000000000000 R09: 0000000000000007
[  218.559600] R10: 0000000180000000 R11: 0000000000000246 R12: 00000000000000d7
[  218.559773] R13: 00007ffd8d48c1e8 R14: 0000000000000000 R15: 00005f749a485cc0
[  218.559941]  </TASK>
[  218.560102] Modules linked in: vhost_net vhost vhost_iotlb tap vfio_pci pci_pf_stub xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables qrtr bridge stp llc bonding nvidia_vgpu_vfio(OE) cfg80211 binfmt_misc nvidia(POE) intel_rapl_msr intel_rapl_common nls_iso8859_1 intel_uncore_frequency intel_uncore_frequency_common i10nm_edac skx_edac_common nfit x86_pkg_temp_thermal intel_powerclamp coretemp vfio_pci_core mdev kvm_intel vfio_iommu_type1 vfio iommufd cmdlinepart spi_nor mtd kvm dax_hmem ast cxl_acpi ipmi_ssif rapl cxl_port irqbypass intel_cstate cxl_core i2c_algo_bit isst_if_mbox_pci isst_if_mmio intel_th_gth isst_if_common mei_me spi_intel_pci ioatdma intel_th_pci i2c_i801 spi_intel mei i2c_smbus intel_th intel_pch_thermal intel_vsec dca acpi_power_meter ipmi_si acpi_ipmi ipmi_devintf ipmi_msghandler acpi_pad joydev input_leds mac_hid sch_fq_codel dm_multipath msr efi_pstore nfnetlink dmi_sysfs ip_tables
[  218.560151]  x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 mlx5_ib ib_uverbs macsec ib_core rndis_host cdc_ether usbnet mii hid_generic usbhid hid raid1 nvme nvme_core nvme_auth mlx5_core crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha256_ssse3 mlxfw sha1_ssse3 psample tls ahci vmd pci_hyperv_intf xhci_pci libahci xhci_pci_renesas aesni_intel crypto_simd cryptd
[  218.562779] CR2: 0000000000000010
[  218.563017] ---[ end trace 0000000000000000 ]---
[  218.631245] RIP: 0010:vfio_df_open+0x3e/0x120 [vfio]
[  218.631515] Code: 83 ec 08 4c 8b 2f 41 8b 85 e4 03 00 00 85 c0 75 6f 41 c7 85 e4 03 00 00 01 00 00 00 4c 8b 37 4c 8b 67 28 49 8b 06 48 8b 40 68 <48> 8b 78 10 e8 f9 ec 9f c6 84 c0 0f 84 ab 00 00 00 4d 85 e4 0f 84
[  218.632046] RSP: 0018:ff6ee0c6240d7b18 EFLAGS: 00010246
[  218.632311] RAX: 0000000000000000 RBX: ff4a4e2a96ed9480 RCX: 0000000000000000
[  218.632578] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ff4a4e2a96ed9480
[  218.632846] RBP: ff6ee0c6240d7b40 R08: 0000000000000000 R09: 0000000000000000
[  218.633115] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[  218.633385] R13: ff4a4e2a97d16000 R14: ff4a4e2a97d16000 R15: 0000000096ed9480
[  218.633656] FS:  00007207c6d16f00(0000) GS:ff4a4e68ffa00000(0000) knlGS:0000000000000000
[  218.633932] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  218.634207] CR2: 0000000000000010 CR3: 0000000135b2a005 CR4: 0000000000773ef0
[  218.634487] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  218.634766] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  218.635044] PKRU: 55555554
[  218.635320] note: qemu-system-x86[2658] exited with irqs disabled

nvidia-smi on a guest (managed='no')

Wed Nov  5 13:02:21 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.95.05              Driver Version: 580.95.05      CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A10-1B                  On  |   00000000:05:00.0 Off |                  N/A |
| N/A   N/A    P0            N/A  /  N/A  |       0MiB /   1024MiB |      0%   Prohibited |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

A vGPU is supported only in unmanaged libvirt mode. Therefore, ensure that in the hostdev element, the managed attribute is set to no.
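
For illustration, here is a sketch of the hostdev element from the question with only the managed attribute changed. With managed='no', libvirt does not detach the device from its host driver, so the virtual function is not handed to vfio-pci (the probe that fails in the kernel log above). The alias and guest-side address elements are left out on the assumption that libvirt assigns them; the PCI address 0000:ca:00.4 is the one used in the question and will differ on other hosts.

```xml
<!-- Sketch: same vGPU virtual function as in the question, unmanaged mode -->
<hostdev mode='subsystem' type='pci' managed='no'>
  <driver name='vfio'/>
  <source>
    <!-- Host PCI address of the VF whose current_vgpu_type was set -->
    <address domain='0x0000' bus='0xca' slot='0x00' function='0x4'/>
  </source>
</hostdev>
```

After editing the domain definition (for example with virsh edit vgpu1), set the vGPU type on the virtual function as in step (2) and start the guest again.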