Use-after-free on GTX 1650 dGPU with 545.29.06 on Fedora 39 + Wayland

Today my notebook (AMD Renoir iGPU + Nvidia GTX 1650 dGPU) did not resume from suspend. After a hard reboot, I checked journalctl for errors, and found this (not sure it is directly related, though):

dez 19 09:01:46 fedoracosta kernel: ==================================================================
dez 19 09:01:46 fedoracosta kernel: Hardware name: Acer Nitro AN515-44/Stonic_RNS, BIOS V1.04 02/04/2021
dez 19 09:01:46 fedoracosta kernel: CPU: 8 PID: 1655 Comm: gnome-shell Tainted: P           OE      6.6.6-200.fc39.x86_64 #1
dez 19 09:01:46 fedoracosta kernel: 
dez 19 09:01:46 fedoracosta kernel:  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
dez 19 09:01:46 fedoracosta kernel:  do_syscall_64+0x60/0x90
dez 19 09:01:46 fedoracosta kernel:  __x64_sys_ioctl+0x97/0xd0
dez 19 09:01:46 fedoracosta kernel:  nvidia_unlocked_ioctl+0x6ee/0x8f0 [nvidia]
dez 19 09:01:46 fedoracosta kernel:  rm_ioctl+0x58/0xb0 [nvidia]
dez 19 09:01:46 fedoracosta kernel:  _nv000719rm+0x1b7/0xe70 [nvidia]
dez 19 09:01:46 fedoracosta kernel:  _nv000566rm+0x4d/0x60 [nvidia]
dez 19 09:01:46 fedoracosta kernel:  _nv044074rm+0x41/0x70 [nvidia]
dez 19 09:01:46 fedoracosta kernel:  _nv044073rm+0xdd/0x180 [nvidia]
dez 19 09:01:46 fedoracosta kernel:  _nv045924rm+0x3e5/0x690 [nvidia]
dez 19 09:01:46 fedoracosta kernel:  _nv045925rm+0xac/0x130 [nvidia]
dez 19 09:01:46 fedoracosta kernel:  _nv044171rm+0xab/0xe0 [nvidia]
dez 19 09:01:46 fedoracosta kernel:  _nv016482rm+0x51c/0x620 [nvidia]
dez 19 09:01:46 fedoracosta kernel:  _nv004237rm+0x1e/0xb0 [nvidia]
dez 19 09:01:46 fedoracosta kernel:  _nv002632rm+0xd/0x20 [nvidia]
dez 19 09:01:46 fedoracosta kernel:  _nv035565rm+0x6b/0x130 [nvidia]
dez 19 09:01:46 fedoracosta kernel:  _nv035601rm+0xca/0x430 [nvidia]
dez 19 09:01:46 fedoracosta kernel:  _nv040305rm+0x67/0xd0 [nvidia]
dez 19 09:01:46 fedoracosta kernel:  nv_dma_release_sgt+0x49/0x70 [nvidia]
dez 19 09:01:46 fedoracosta kernel: freed by task 1655 on cpu 8 at 18.874594s:
dez 19 09:01:46 fedoracosta kernel: 
dez 19 09:01:46 fedoracosta kernel:  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
dez 19 09:01:46 fedoracosta kernel:  do_syscall_64+0x60/0x90
dez 19 09:01:46 fedoracosta kernel:  __x64_sys_ioctl+0x97/0xd0
dez 19 09:01:46 fedoracosta kernel:  drm_ioctl+0x26d/0x4b0
dez 19 09:01:46 fedoracosta kernel:  drm_ioctl_kernel+0xcd/0x170
dez 19 09:01:46 fedoracosta kernel:  drm_prime_fd_to_handle_ioctl+0xf7/0x200
dez 19 09:01:46 fedoracosta kernel:  drm_gem_prime_import_dev+0x87/0x140
dez 19 09:01:46 fedoracosta kernel:  nv_drm_gem_prime_import_sg_table+0x2d/0xb0 [nvidia_drm]
dez 19 09:01:46 fedoracosta kernel: allocated by task 1655 on cpu 6 at 18.658148s:
dez 19 09:01:46 fedoracosta kernel: kfence-#154: 0x000000008eda53aa-0x00000000c63cc6c9, size=384, cache=kmalloc-512
dez 19 09:01:46 fedoracosta kernel: 
dez 19 09:01:46 fedoracosta kernel:  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
dez 19 09:01:46 fedoracosta kernel:  do_syscall_64+0x60/0x90
dez 19 09:01:46 fedoracosta kernel:  __x64_sys_ioctl+0x97/0xd0
dez 19 09:01:46 fedoracosta kernel:  nvidia_unlocked_ioctl+0x6ee/0x8f0 [nvidia]
dez 19 09:01:46 fedoracosta kernel:  rm_ioctl+0x58/0xb0 [nvidia]
dez 19 09:01:46 fedoracosta kernel:  _nv000719rm+0x1b7/0xe70 [nvidia]
dez 19 09:01:46 fedoracosta kernel:  _nv000566rm+0x4d/0x60 [nvidia]
dez 19 09:01:46 fedoracosta kernel:  _nv044074rm+0x41/0x70 [nvidia]
dez 19 09:01:46 fedoracosta kernel:  _nv044073rm+0xdd/0x180 [nvidia]
dez 19 09:01:46 fedoracosta kernel:  _nv045924rm+0x3e5/0x690 [nvidia]
dez 19 09:01:46 fedoracosta kernel:  _nv045925rm+0xac/0x130 [nvidia]
dez 19 09:01:46 fedoracosta kernel:  _nv044171rm+0xab/0xe0 [nvidia]
dez 19 09:01:46 fedoracosta kernel:  _nv016482rm+0x51c/0x620 [nvidia]
dez 19 09:01:46 fedoracosta kernel:  _nv004237rm+0x1e/0xb0 [nvidia]
dez 19 09:01:46 fedoracosta kernel:  _nv002632rm+0xd/0x20 [nvidia]
dez 19 09:01:46 fedoracosta kernel:  _nv035565rm+0x6b/0x130 [nvidia]
dez 19 09:01:46 fedoracosta kernel:  _nv035601rm+0xca/0x430 [nvidia]
dez 19 09:01:46 fedoracosta kernel:  _nv040305rm+0x67/0xd0 [nvidia]
dez 19 09:01:46 fedoracosta kernel:  nv_dma_release_sgt+0x29/0x70 [nvidia]
dez 19 09:01:46 fedoracosta kernel: Use-after-free read at 0x000000007342f808 (in kfence-#154):
dez 19 09:01:46 fedoracosta kernel: BUG: KFENCE: use-after-free read in nv_dma_release_sgt+0x29/0x70 [nvidia]
==================================================================
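
For anyone who wants to check their own logs for the same report, here is a minimal sketch in Python that pulls the KFENCE headline lines out of kernel log text (for example the output of `journalctl -k` or `dmesg`). The function name `find_kfence_reports` and the regular expression are my own assumptions based on the headline format seen in this thread; other kernel versions may word it differently.

```python
import re

# Matches the KFENCE headline as it appears in this thread, e.g.
# "BUG: KFENCE: use-after-free read in nv_dma_release_sgt+0x29/0x70 [nvidia]"
# (pattern is an assumption derived from that one line format)
KFENCE_RE = re.compile(
    r"BUG: KFENCE: (?P<kind>.+?) in (?P<symbol>[\w.]+)"
    r"\+(?P<offset>0x[0-9a-f]+)/0x[0-9a-f]+"
    r"(?: \[(?P<module>\w+)\])?"
)


def find_kfence_reports(log_text):
    """Return one dict per KFENCE headline found in the given log text."""
    reports = []
    for line in log_text.splitlines():
        match = KFENCE_RE.search(line)
        if match:
            reports.append(match.groupdict())
    return reports


sample = (
    "[76732.070627] BUG: KFENCE: use-after-free read in "
    "nv_dma_release_sgt+0x29/0x70 [nvidia]"
)
for report in find_kfence_reports(sample):
    print(report["kind"], report["symbol"], report["offset"], report["module"])
```

To scan a real log, save the kernel log to a file and pass its contents to `find_kfence_reports` instead of `sample`. The headline alone only tells you which function tripped KFENCE; the allocation and free stacks still have to be read from the full report.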

nvidia-bug-report.log.gz (131.2 KB)

I have the exact same issue on Fedora 39 running the default GNOME 45 on Wayland with the 545.29.06 driver.
Any screen connected to the NVIDIA dGPU works at the login screen and keeps working after I log in.
However, when I connect a screen to the NVIDIA dGPU after I’ve logged in, the screen stays black and I get the same error as the OP (log below).
This means I have to keep my external screen turned on from the moment it lights up at the login screen.
I cannot suspend my laptop, switch inputs on the monitor, or let the screensaver turn the monitor off.
Anything that causes the screen to disconnect and reconnect triggers the kernel error and turns the screen black.
When I go back to the login screen it lights up again (so I lose all of my work every time this happens…)

[76732.070625] ==================================================================
[76732.070627] BUG: KFENCE: use-after-free read in nv_dma_release_sgt+0x29/0x70 [nvidia]

[76732.070837] Use-after-free read at 0x000000009058acab (in kfence-#87):
[76732.070838] nv_dma_release_sgt+0x29/0x70 [nvidia]
[76732.071017] _nv040305rm+0x67/0xd0 [nvidia]
[76732.071232] _nv035601rm+0xc7/0x430 [nvidia]
[76732.071575] _nv035565rm+0x6b/0x130 [nvidia]
[76732.071799] _nv002632rm+0xd/0x20 [nvidia]
[76732.072093] _nv004237rm+0x1b/0xb0 [nvidia]
[76732.072355] _nv016482rm+0x51c/0x620 [nvidia]
[76732.072631] _nv044171rm+0xab/0xe0 [nvidia]
[76732.072842] _nv045925rm+0xa9/0x130 [nvidia]
[76732.073119] _nv045924rm+0x3e5/0x690 [nvidia]
[76732.073390] _nv044073rm+0xdd/0x180 [nvidia]
[76732.073605] _nv044074rm+0x41/0x70 [nvidia]
[76732.073813] _nv000566rm+0x4a/0x60 [nvidia]
[76732.074029] _nv000719rm+0x1b7/0xe70 [nvidia]
[76732.074235] rm_ioctl+0x58/0xb0 [nvidia]
[76732.074438] nvidia_unlocked_ioctl+0x6ee/0x8f0 [nvidia]
[76732.074633] __x64_sys_ioctl+0x94/0xd0
[76732.074636] do_syscall_64+0x5d/0x90
[76732.074639] entry_SYSCALL_64_after_hwframe+0x6e/0xd8

[76732.074642] kfence-#87: 0x00000000fc47053a-0x00000000b44a507e, size=384, cache=kmalloc-512

[76732.074644] allocated by task 4851 on cpu 7 at 76732.056968s:
[76732.074649] nv_drm_gem_prime_import_sg_table+0x2d/0xb0 [nvidia_drm]
[76732.074656] drm_gem_prime_import_dev+0x84/0x140
[76732.074658] drm_prime_fd_to_handle_ioctl+0xf4/0x200
[76732.074660] drm_ioctl_kernel+0xca/0x170
[76732.074662] drm_ioctl+0x26d/0x4b0
[76732.074663] __x64_sys_ioctl+0x94/0xd0
[76732.074665] do_syscall_64+0x5d/0x90
[76732.074666] entry_SYSCALL_64_after_hwframe+0x6e/0xd8

[76732.074668] freed by task 4851 on cpu 7 at 76732.070619s:
[76732.076586] nv_dma_release_sgt+0x49/0x70 [nvidia]
[76732.076780] _nv040305rm+0x67/0xd0 [nvidia]
[76732.076982] _nv035601rm+0xc7/0x430 [nvidia]
[76732.077323] _nv035565rm+0x6b/0x130 [nvidia]
[76732.077528] _nv002632rm+0xd/0x20 [nvidia]
[76732.077776] _nv004237rm+0x1b/0xb0 [nvidia]
[76732.078036] _nv016482rm+0x51c/0x620 [nvidia]
[76732.078286] _nv044171rm+0xab/0xe0 [nvidia]
[76732.078489] _nv045925rm+0xa9/0x130 [nvidia]
[76732.078753] _nv045924rm+0x3e5/0x690 [nvidia]
[76732.079019] _nv044073rm+0xdd/0x180 [nvidia]
[76732.079217] _nv044074rm+0x41/0x70 [nvidia]
[76732.079409] _nv000566rm+0x4a/0x60 [nvidia]
[76732.079603] _nv000719rm+0x1b7/0xe70 [nvidia]
[76732.079803] rm_ioctl+0x58/0xb0 [nvidia]
[76732.080004] nvidia_unlocked_ioctl+0x6ee/0x8f0 [nvidia]
[76732.080191] __x64_sys_ioctl+0x94/0xd0
[76732.080194] do_syscall_64+0x5d/0x90
[76732.080196] entry_SYSCALL_64_after_hwframe+0x6e/0xd8

[76732.080199] CPU: 7 PID: 4851 Comm: gnome-shell Tainted: P B O 6.6.6-200.fc39.x86_64 #1
[76732.080201] Hardware name: Razer Blade 15 Advanced Model (Mid 2021) - RZ09-0409/CH570, BIOS 2.02 11/12/2021
[76732.080202] ==================================================================

Nice report; you have a more concrete case for debugging. Let’s hope this gets some attention.

Hi, I just wanted to clarify that the black screen when the monitor disconnects and reconnects was actually caused by an unofficial patch for GNOME’s mutter.

I think the kernel error I was experiencing when connecting a new screen is solved by applying this patch:

At least it hasn’t happened so far since applying the patch.

This still happens with 555.52.04, where said patch is already included.