nvidia-bug-report.log.gz (386.1 KB)
I’m on Gentoo, running:
- GeForce GTX 960 in slot 1
- GeForce GTX 950 in slot 2
I have two cards because I use one for VFIO/GPU passthrough (using the ACS override patch).
Things work normally on Linux kernel 5.4.60 with NVIDIA driver 460.91.03-r2.
However, over the past several months I have tried upgrading both the kernel and the driver, with no luck. Every combination of the following that I have tried has triggered a crash on start-up:
- Kernels 5.10, 5.12, 5.15
- NVIDIA drivers 460.x, 470.x, 495.x
- ACS override patch on or off
Here’s a snippet of the boot log running kernel 5.15, driver 470.x, ACS override disabled:
Jan 01 16:33:42 [kernel] [ 20.958863] nvidia: loading out-of-tree module taints kernel.
- Last output repeated twice -
Jan 01 16:33:42 [kernel] [ 20.958871] nvidia: module license 'NVIDIA' taints kernel.
Jan 01 16:33:42 [kernel] [ 20.958937] nvidia: module license 'NVIDIA' taints kernel.
Jan 01 16:33:42 [kernel] [ 20.958995] Disabling lock debugging due to kernel taint
Jan 01 16:33:42 [kernel] [ 20.972276] nvidia: module verification failed: signature and/or required key missing - tainting kernel
Jan 01 16:33:42 [kernel] [ 20.983866] nvidia-nvlink: Nvlink Core is being initialized, major device number 243
Jan 01 16:33:42 [kernel] [ 20.983944]
Jan 01 16:33:42 [kernel] [ 20.984656] nvidia 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem
Jan 01 16:33:42 [kernel] [ 21.184278] NVRM: The NVIDIA probe routine was not called for 1 device(s).
Jan 01 16:33:42 [kernel] [ 21.184376] NVRM: This can occur when a driver such as:
Jan 01 16:33:42 [kernel] [ 21.184376] NVRM: nouveau, rivafb, nvidiafb or rivatv
Jan 01 16:33:42 [kernel] [ 21.184376] NVRM: was loaded and obtained ownership of the NVIDIA device(s).
Jan 01 16:33:42 [kernel] [ 21.184470] NVRM: Try unloading the conflicting kernel module (and/or
Jan 01 16:33:42 [kernel] [ 21.184470] NVRM: reconfigure your kernel without the conflicting
Jan 01 16:33:42 [kernel] [ 21.184470] NVRM: driver(s)), then try loading the NVIDIA kernel module
Jan 01 16:33:42 [kernel] [ 21.184470] NVRM: again.
Jan 01 16:33:42 [kernel] [ 21.184583] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 495.46 Wed Oct 27 16:31:33 UTC 2021
Jan 01 16:33:42 [kernel] [ 21.339294] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 495.46 Wed Oct 27 16:22:48 UTC 2021
Jan 01 16:33:42 [kernel] [ 21.409697] [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
Jan 01 16:33:42 [kernel] [ 21.410151] pmd_set_huge: Cannot satisfy [mem 0xf6000000-0xf6200000] with a huge-page mapping due to MTRR override.
Jan 01 16:33:42 [kernel] [ 21.424758] resource sanity check: requesting [mem 0x000e0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000e0000-0x000e3fff window]
Jan 01 16:33:42 [kernel] [ 21.424836] caller _nv032275rm+0x2a/0x60 [nvidia] mapping multiple BARs
Jan 01 16:33:42 [kernel] [ 21.577405] resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000d0000-0x000d3fff window]
Jan 01 16:33:42 [kernel] [ 21.577482] caller _nv000717rm+0x1ad/0x200 [nvidia] mapping multiple BARs
Jan 01 16:33:42 [kernel] [ 22.293655] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 0
Jan 01 16:33:42 [kernel] [ 32.709826] ext2 filesystem being mounted at /boot supports timestamps until 2038 (0x7fffffff)
Jan 01 16:33:42 [kernel] [ 32.794307] EXT4-fs (dm-4): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
Jan 01 16:33:42 [kernel] [ 32.867490] SGI XFS with ACLs, security attributes, quota, no debug enabled
Jan 01 16:33:42 [kernel] [ 32.868397] XFS (dm-6): Mounting V5 Filesystem
Jan 01 16:33:42 [kernel] [ 33.088618] XFS (dm-6): Ending clean mount
Jan 01 16:33:42 [kernel] [ 33.089936] xfs filesystem being mounted at /data1 supports timestamps until 2038 (0x7fffffff)
Jan 01 16:33:42 [kernel] [ 33.163333] EXT4-fs (dm-8): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
Jan 01 16:33:42 [kernel] [ 33.200724] EXT4-fs (dm-3): re-mounted. Opts: . Quota mode: none.
Jan 01 16:33:42 [kernel] [ 33.381467] Adding 10485756k swap on /dev/mapper/vg1-swap. Priority:-2 extents:1 across:10485756k
Jan 01 16:33:43 [kernel] [ 40.202600] #PF: error_code(0x0000) - not-present page
Jan 01 16:33:43 [kernel] [ 40.202666] PGD 0 P4D 0
Jan 01 16:33:43 [kernel] [ 40.202732] Oops: 0000 [#1] SMP PTI
Jan 01 16:33:43 [kernel] [ 40.202797] CPU: 5 PID: 3726 Comm: X Tainted: P OE 5.15.11-gentoo-x86_64 #1
Jan 01 16:33:43 [kernel] [ 40.202873] Hardware name: MSI MS-7922/ Z97S SLI Krait Edition (MS-7922), BIOS V10.7 02/16/2016
Jan 01 16:33:43 [kernel] [ 40.202948] RIP: 0010:nv_audio_dynamic_power+0xb4/0x120 [nvidia]
Jan 01 16:33:43 [kernel] [ 40.203154] Code: 01 00 00 48 85 d2 74 9a 48 8b 82 a8 01 00 00 48 81 c2 a0 01 00 00 48 39 d0 75 0f eb 85 48 8b 40 08 48 39 d0 0f 84 78 ff ff ff <83> 78 1c 03 75 ed 48 8b 78 20 48 8b 87 30 03 00 00 48 85 ff 0f 84
Jan 01 16:33:43 [kernel] [ 40.203243] RSP: 0018:ffffc900021c7490 EFLAGS: 00010207
Jan 01 16:33:43 [kernel] [ 40.203309] RAX: 0000000000000000 RBX: ffff88811125f080 RCX: 0000000000000002
Jan 01 16:33:43 [kernel] [ 40.203377] RDX: ffff888106adfda0 RSI: 0000000000000000 RDI: ffff888101557108
Jan 01 16:33:43 [kernel] [ 40.203444] RBP: ffff88811125ef70 R08: ffff888101271ca0 R09: 0000000000000000
Jan 01 16:33:43 [kernel] [ 40.203511] R10: ffffffffa125d2a0 R11: ffffc9000015d008 R12: ffff88811125efb8
Jan 01 16:33:43 [kernel] [ 40.203579] R13: ffffffffa332b180 R14: ffff88810afc4020 R15: 0000000000000000
Jan 01 16:33:43 [kernel] [ 40.203645] FS: 00007fee4397f8c0(0000) GS:ffff88881ed40000(0000) knlGS:0000000000000000
Jan 01 16:33:43 [kernel] [ 40.203719] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 01 16:33:43 [kernel] [ 40.203785] CR2: 000000000000001c CR3: 0000000109bb8006 CR4: 00000000001706e0
Jan 01 16:33:43 [kernel] [ 40.203853] Call Trace:
Jan 01 16:33:43 [kernel] [ 40.203918] <TASK>
Jan 01 16:33:43 [kernel] [ 40.203982] ? _nv035225rm+0x3b/0x150 [nvidia]
Jan 01 16:33:43 [kernel] [ 40.204177] _nv037976rm+0x25/0x30 [nvidia]
Jan 01 16:33:43 [kernel] [ 40.204396] ? _nv000790rm+0x70/0x70 [nvidia]
Jan 01 16:33:43 [kernel] [ 40.204610] ? _nv034973rm+0x18c/0x1a0 [nvidia]
Jan 01 16:33:43 [kernel] [ 40.204860] ? _nv036712rm+0x265/0x2c0 [nvidia]
Jan 01 16:33:43 [kernel] [ 40.205112] ? _nv014671rm+0x76e/0x920 [nvidia]
Jan 01 16:33:43 [kernel] [ 40.205308] ? _nv035105rm+0x53/0x170 [nvidia]
Jan 01 16:33:43 [kernel] [ 40.205492] ? _nv019170rm+0x842/0xc90 [nvidia]
Jan 01 16:33:43 [kernel] [ 40.205676] ? _nv019170rm+0xc86/0xc90 [nvidia]
Jan 01 16:33:43 [kernel] [ 40.205861] ? _nv019170rm+0xc6f/0xc90 [nvidia]
Jan 01 16:33:43 [kernel] [ 40.206045] ? rm_kernel_rmapi_op+0x141/0x190 [nvidia]
Jan 01 16:33:43 [kernel] [ 40.206263] ? nvkms_call_rm+0x4b/0x80 [nvidia_modeset]
Jan 01 16:33:43 [kernel] [ 40.206340] ? _nv002514kms+0x51/0x60 [nvidia_modeset]
Jan 01 16:33:43 [kernel] [ 40.206417] ? _raw_spin_lock_irqsave+0x32/0x50
Jan 01 16:33:43 [kernel] [ 40.206486] ? _nv002155kms+0x5cf/0x9e0 [nvidia_modeset]
Jan 01 16:33:43 [kernel] [ 40.206561] ? trace_hardirqs_on+0x2b/0xb0
Jan 01 16:33:43 [kernel] [ 40.206627] ? _raw_spin_unlock_irqrestore+0x16/0x20
Jan 01 16:33:43 [kernel] [ 40.206694] ? nv_init_msi+0xcc/0xf0 [nvidia]
Jan 01 16:33:43 [kernel] [ 40.206836] ? _nv002376kms+0x12b/0x510 [nvidia_modeset]
Jan 01 16:33:43 [kernel] [ 40.206912] ? nvkms_call_rm+0x5b/0x80 [nvidia_modeset]
Jan 01 16:33:43 [kernel] [ 40.206983] ? _nv002514kms+0x51/0x60 [nvidia_modeset]
Jan 01 16:33:43 [kernel] [ 40.207060] ? _nv002607kms+0x20b/0xce0 [nvidia_modeset]
Jan 01 16:33:43 [kernel] [ 40.207137] ? _nv002569kms+0x2796/0x2e40 [nvidia_modeset]
Jan 01 16:33:43 [kernel] [ 40.207213] ? finish_task_switch.isra.0+0xa8/0x260
Jan 01 16:33:43 [kernel] [ 40.207281] ? get_page_from_freelist+0xc5/0x3b0
Jan 01 16:33:43 [kernel] [ 40.207348] ? trace_hardirqs_on+0x2b/0xb0
Jan 01 16:33:43 [kernel] [ 40.207414] ? _nv002584kms+0x19a/0x760 [nvidia_modeset]
Jan 01 16:33:43 [kernel] [ 40.207490] ? trace_hardirqs_on+0x2b/0xb0
Jan 01 16:33:43 [kernel] [ 40.207555] ? nv_kthread_q_stop+0x2240/0x2d30 [nvidia_modeset]
Jan 01 16:33:43 [kernel] [ 40.207628] ? _nv002333kms+0x11a/0x230 [nvidia_modeset]
Jan 01 16:33:43 [kernel] [ 40.207701] ? trace_hardirqs_on+0x2b/0xb0
Jan 01 16:33:43 [kernel] [ 40.207767] ? kfree+0xb3/0x160
Jan 01 16:33:43 [kernel] [ 40.207833] ? _nv002054kms+0x18b3/0x2710 [nvidia_modeset]
Jan 01 16:33:43 [kernel] [ 40.207912] ? trace_hardirqs_on+0x2b/0xb0
Jan 01 16:33:43 [kernel] [ 40.207977] ? _nv002054kms+0x18b3/0x2710 [nvidia_modeset]
Jan 01 16:33:43 [kernel] [ 40.208056] ? _nv002306kms+0x426/0x1330 [nvidia_modeset]
Jan 01 16:33:43 [kernel] [ 40.208149] ? trace_hardirqs_on+0x2b/0xb0
Jan 01 16:33:43 [kernel] [ 40.208237] ? kfree+0xb3/0x160
Jan 01 16:33:43 [kernel] [ 40.208322] ? nv_kthread_q_stop+0x2188/0x2d30 [nvidia_modeset]
Jan 01 16:33:43 [kernel] [ 40.208394] ? nv_kthread_q_stop+0x2216/0x2d30 [nvidia_modeset]
Jan 01 16:33:43 [kernel] [ 40.208467] ? nvKmsIoctl+0x96/0x1d0 [nvidia_modeset]
Jan 01 16:33:43 [kernel] [ 40.208539] ? nvkms_ioctl_from_kapi+0x4c/0x90 [nvidia_modeset]
Jan 01 16:33:43 [kernel] [ 40.208611] ? _nv002054kms+0x36c/0x2710 [nvidia_modeset]
Jan 01 16:33:43 [kernel] [ 40.208689] ? nv_drm_exit+0xde/0x350 [nvidia_drm]
Jan 01 16:33:43 [kernel] [ 40.208756] ? _nv002054kms+0x330/0x2710 [nvidia_modeset]
Jan 01 16:33:43 [kernel] [ 40.209070] ? __fput+0x94/0x250
Jan 01 16:33:43 [kernel] [ 40.214960] ? task_work_run+0x61/0x90
Jan 01 16:33:43 [kernel] [ 40.215026] ? exit_to_user_mode_loop+0x133/0x140
Jan 01 16:33:43 [kernel] [ 40.215093] ? exit_to_user_mode_prepare+0x8d/0xa0
Jan 01 16:33:43 [kernel] [ 40.215159] ? syscall_exit_to_user_mode+0x27/0x50
Jan 01 16:33:43 [kernel] [ 40.215245] ? do_syscall_64+0x48/0xc0
Jan 01 16:33:43 [kernel] [ 40.215310] ? entry_SYSCALL_64_after_hwframe+0x44/0xae
Jan 01 16:33:43 [kernel] [ 40.215378] </TASK>
Jan 01 16:33:43 [kernel] [ 40.215442] Modules linked in: xfs ext2 nvidia_drm(POE) nvidia_modeset(POE) nvidia(POE) x86_pkg_temp_thermal coretemp drm_kms_helper kvm_intel snd_hda_codec_realtek snd_hda_codec_generic kvm ledtrig_audio drm snd_hda_intel syscopyarea ppdev sysfillrect mxm_wmi snd_intel_dspcfg at24 regmap_i2c snd_hda_codec iTCO_wdt iTCO_vendor_support i2c_i801 snd_hda_core ghash_clmulni_intel sysimgblt lpc_ich i2c_smbus fb_sys_fops snd_hwdep i2c_core mfd_core pcspkr snd_pcm parport_pc parport wmi video btrfs blake2b_generic xor zstd_compress hci_vhci ppp_generic slhc bluetooth vhost_net snd_seq tun snd_seq_device snd_timer cuse vhost ecdh_generic vhost_iotlb fuse tap ecc snd raid6_pq autofs4 soundcore rfkill nvram ext4 mbcache jbd2 dm_crypt encrypted_keys sd_mod t10_pi xhci_pci crc32c_intel ahci r8169 libahci xhci_hcd realtek
Jan 01 16:33:43 [kernel] [ 40.215646] CR2: 000000000000001c
Jan 01 16:33:43 [kernel] [ 40.215713] ---[ end trace b49938411775b3a9 ]---
Jan 01 16:33:43 [kernel] [ 40.215797] RIP: 0010:nv_audio_dynamic_power+0xb4/0x120 [nvidia]
Jan 01 16:33:43 [kernel] [ 40.215937] Code: 01 00 00 48 85 d2 74 9a 48 8b 82 a8 01 00 00 48 81 c2 a0 01 00 00 48 39 d0 75 0f eb 85 48 8b 40 08 48 39 d0 0f 84 78 ff ff ff <83> 78 1c 03 75 ed 48 8b 78 20 48 8b 87 30 03 00 00 48 85 ff 0f 84
Jan 01 16:33:43 [kernel] [ 40.216026] RSP: 0018:ffffc900021c7490 EFLAGS: 00010207
Jan 01 16:33:43 [kernel] [ 40.216092] RAX: 0000000000000000 RBX: ffff88811125f080 RCX: 0000000000000002
Jan 01 16:33:43 [kernel] [ 40.216159] RDX: ffff888106adfda0 RSI: 0000000000000000 RDI: ffff888101557108
Jan 01 16:33:43 [kernel] [ 40.216245] RBP: ffff88811125ef70 R08: ffff888101271ca0 R09: 0000000000000000
Jan 01 16:33:43 [kernel] [ 40.216331] R10: ffffffffa125d2a0 R11: ffffc9000015d008 R12: ffff88811125efb8
Jan 01 16:33:43 [kernel] [ 40.216399] R13: ffffffffa332b180 R14: ffff88810afc4020 R15: 0000000000000000
Jan 01 16:33:43 [kernel] [ 40.216466] FS: 00007fee4397f8c0(0000) GS:ffff88881ed40000(0000) knlGS:0000000000000000
Jan 01 16:33:43 [kernel] [ 40.216541] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 01 16:33:43 [kernel] [ 40.216607] CR2: 000000000000001c CR3: 0000000109bb8006 CR4: 00000000001706e0
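For anyone skimming the trace above, here is a minimal sketch of how the key fields of the oops (faulting function, module, and fault address) could be pulled out of a log like this. The helper name `summarize_oops` and the "address below one page means a likely NULL-pointer dereference" heuristic are my own assumptions for illustration, not part of any NVIDIA or kernel tooling. The sample lines are copied from the log above.

```python
import re

# Hypothetical helper for illustration only; the patterns are tailored to the
# oops format shown in this post and are not a general-purpose parser.
RIP_RE = re.compile(r"RIP: [0-9a-f]{4}:([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?")
CR2_RE = re.compile(r"CR2: ([0-9a-f]{16})")
RAX_RE = re.compile(r"RAX: ([0-9a-f]{16})")

PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on typical x86_64 systems


def summarize_oops(log_text):
    """Return the faulting function, module, and fault address from an oops."""
    rip = RIP_RE.search(log_text)
    cr2 = CR2_RE.search(log_text)
    rax = RAX_RE.search(log_text)
    summary = {
        "function": rip.group(1) if rip else None,
        "module": rip.group(2) if rip else None,
        "fault_address": int(cr2.group(1), 16) if cr2 else None,
        "rax": int(rax.group(1), 16) if rax else None,
    }
    addr = summary["fault_address"]
    # Heuristic (assumption): a fault address inside the first page usually
    # means a small field offset was applied to a NULL base pointer.
    summary["looks_like_null_deref"] = addr is not None and addr < PAGE_SIZE
    return summary


# Lines copied from the log in this post.
SAMPLE = """\
Jan 01 16:33:43 [kernel] [ 40.202948] RIP: 0010:nv_audio_dynamic_power+0xb4/0x120 [nvidia]
Jan 01 16:33:43 [kernel] [ 40.203309] RAX: 0000000000000000 RBX: ffff88811125f080 RCX: 0000000000000002
Jan 01 16:33:43 [kernel] [ 40.203785] CR2: 000000000000001c CR3: 0000000109bb8006 CR4: 00000000001706e0
"""

if __name__ == "__main__":
    print(summarize_oops(SAMPLE))
```

In this trace, RAX is zero and CR2 is 0x1c, which is consistent with the faulting instruction in the `Code:` line (`83 78 1c 03`, a compare against memory at offset 0x1c from RAX) reading through a NULL pointer inside `nv_audio_dynamic_power`.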