Bug Report: Kernel NULL pointer dereference in nvidia_modeset during Thunderbolt dock disconnect (multiple monitors only)
Summary
Disconnecting a Thunderbolt 4 dock causes a kernel NULL pointer dereference in nvidia_modeset during drm_atomic_commit, crashing the sway compositor and freezing the system.
The crash only occurs when 2+ external monitors are connected. With a single external monitor, hot-unplug works correctly and sway gracefully falls back to the internal display.
This suggests a bug in the atomic commit path when disabling multiple CRTCs/connectors simultaneously during hot-unplug.
Hardware
- Laptop: Lenovo ThinkPad X1 Extreme Gen 2 (20QV001GMX)
- GPU: NVIDIA GeForce GTX 1650 Mobile / Max-Q [10de:1f91] (Turing)
- iGPU: Intel UHD Graphics 630
- Dock: Lenovo ThinkPad Thunderbolt 4 Dock (40B0)
- Thunderbolt Controller: Intel JHL7540 (Titan Ridge)
- External Displays: Samsung S32D850 (2560x1440), Samsung S34CG50 (3440x1440) via dock DP ports
Software
- OS: NixOS 25.11
- Compositor: sway (wlroots-based Wayland compositor)
- Display Configuration: Hybrid GPU setup using WLR_DRM_DEVICES=/dev/dri/igpu:/dev/dri/dgpu
Versions Tested (all exhibit the crash)
NVIDIA Drivers:
- 580.119.02 (stable/production)
- 590.48.01 (latest/beta)
- Both open and closed kernel modules
Linux Kernels:
- 6.6.x (LTS)
- 6.12.64
- 6.18.4 (latest)
Steps to Reproduce
- Boot the system with the Thunderbolt dock connected
- Connect two or more external monitors to the dock (e.g., one HDMI, one DisplayPort)
- Log in to the sway compositor (external monitors work correctly)
- Physically disconnect the Thunderbolt dock cable
- The system freezes immediately
Does not crash when:
- Only one external monitor is connected to the dock. In this case, hot-unplug works correctly and sway falls back to eDP-1
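Since the crash depends on how many external monitors are attached, it can help to record which DRM connectors were connected right before pulling the cable. The following is a minimal sketch, assuming the standard sysfs layout where each connector appears as /sys/class/drm/cardN-NAME/status; the function names and the rule for spotting internal panels (eDP/LVDS/DSI in the connector name) are illustrative assumptions, not part of any NVIDIA or wlroots tooling.

```python
from pathlib import Path


def connected_connectors(drm_root="/sys/class/drm"):
    """Return the names of DRM connectors whose sysfs status is 'connected'."""
    connected = []
    for status_file in sorted(Path(drm_root).glob("card*-*/status")):
        try:
            state = status_file.read_text().strip()
        except OSError:
            # The connector may vanish mid-scan, e.g. during a hot-unplug.
            continue
        if state == "connected":
            connected.append(status_file.parent.name)
    return connected


def external_connectors(names):
    """Drop internal panels, assuming they are named eDP, LVDS or DSI."""
    internal_tags = ("eDP", "LVDS", "DSI")
    return [
        name
        for name in names
        if not any(f"-{tag}-" in name for tag in internal_tags)
    ]
```

Running `external_connectors(connected_connectors())` before each unplug test and noting the result next to the outcome would make the "one monitor works, two crash" pattern easy to confirm.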
Behavior
Kernel NULL pointer dereference in nvidia_modeset, killing the sway process and freezing the display. System requires hard reboot.
Kernel Oops (Driver 590.48.01, Kernel 6.12.64)
BUG: kernel NULL pointer dereference, address: 0000000000000409
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
Oops: Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 2 UID: 1000 PID: 1592 Comm: sway Tainted: P O 6.12.64 #1-NixOS
Hardware name: LENOVO 20QV001GMX/20QV001GMX, BIOS N2OET69W (1.56 ) 12/02/2025
RIP: 0010:_nv000778kms+0x4/0x10 [nvidia_modeset]
Call Trace:
<TASK>
_nv001339kms+0x94/0x180 [nvidia_modeset]
_nv001293kms+0x25f/0x520 [nvidia_modeset]
_nv001254kms+0xb4/0x202 [nvidia_modeset]
_nv001271kms+0x9d/0x180 [nvidia_modeset]
_nv002592kms+0x45f/0x7d0 [nvidia_modeset]
_nv000725kms+0x1a6/0x620 [nvidia_modeset]
? _nv003066kms+0x74/0x140 [nvidia_modeset]
_nv003103kms+0x9a1/0x45e0 [nvidia_modeset]
nvKmsIoctl+0xf7/0x270 [nvidia_modeset]
nvkms_ioctl_from_kapi_try_pmlock+0x60/0xa0 [nvidia_modeset]
_nv000022kms+0x33e/0xbb0 [nvidia_modeset]
nv_drm_atomic_apply_modeset_config+0x709/0x7b0 [nvidia_drm]
drm_atomic_check_only+0x5f3/0xa10
drm_atomic_commit+0x69/0xe0
drm_mode_atomic_ioctl+0xaff/0xd70
drm_ioctl_kernel+0xad/0x100
drm_ioctl+0x2b0/0x520
__x64_sys_ioctl+0x91/0xd0
do_syscall_64+0xae/0x200
</TASK>
CR2: 0000000000000409
note: sway[1592] exited with irqs disabled
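The two #PF lines and the fault address in the oops above can be decoded mechanically. The sketch below uses the x86 page-fault error code bit layout (bit 0 present, bit 1 write, bit 2 user mode, bit 3 reserved bit, bit 4 instruction fetch) and the fact that the kernel reports a "NULL pointer dereference" for kernel-mode faults below one page. Reading 0x409 as a structure field offset from a NULL base pointer is only an assumption; it cannot be confirmed without the driver source.

```python
PAGE_SIZE = 4096  # base page size on x86-64


def decode_pf_error_code(code):
    """Split an x86 page-fault error code into its documented flag bits."""
    return {
        "present": bool(code & 0x01),            # False: page was not present
        "write": bool(code & 0x02),              # False: the access was a read
        "user_mode": bool(code & 0x04),          # False: fault in kernel mode
        "reserved_bit": bool(code & 0x08),
        "instruction_fetch": bool(code & 0x10),
    }


def is_null_plus_offset(address):
    """True if the address lies in the first page, i.e. NULL plus a small offset."""
    return 0 <= address < PAGE_SIZE


flags = decode_pf_error_code(0x0000)   # error_code from the oops above
near_null = is_null_plus_offset(0x409)  # CR2 from the oops above
```

With error_code 0x0000 every flag is False, which matches "supervisor read access" on a "not-present page", and CR2 = 0x409 falls inside the first page.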
Kernel Oops (Driver 580.119.02) - Same crash, different symbol
RIP: 0010:_nv000899kms+0x4/0x10 [nvidia_modeset]
Same call trace through nv_drm_atomic_apply_modeset_config.
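Because the obfuscated _nvNNNNNNkms symbol names differ between the two driver releases, comparing the oopses by symbol name is misleading. The sketch below parses an oops and builds a signature from the parts that look stable here: the RIP offset, size and module, plus the named (non-obfuscated) frames. The regular expressions and function names are illustrative assumptions based only on the log format shown in this report.

```python
import re

RIP_RE = re.compile(
    r"RIP: [0-9a-f]{4}:(?P<symbol>\S+?)\+(?P<offset>0x[0-9a-f]+)"
    r"/(?P<size>0x[0-9a-f]+)(?: \[(?P<module>\w+)\])?"
)
FRAME_RE = re.compile(
    r"(?P<uncertain>\? )?(?P<symbol>[A-Za-z_][\w.]*)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(?P<module>\w+)\])?"
)
OBFUSCATED_RE = re.compile(r"^_nv\d+kms$")


def parse_oops(text):
    """Return (rip, frames) where frames are (symbol, module, reliable) tuples."""
    rip = None
    frames = []
    in_trace = False
    for line in text.splitlines():
        if rip is None:
            match = RIP_RE.search(line)
            if match:
                rip = match.groupdict()
                continue
        if "</TASK>" in line:
            in_trace = False
        elif "<TASK>" in line:
            in_trace = True
        elif in_trace:
            match = FRAME_RE.search(line)
            if match:
                # Frames prefixed with "? " are unreliable stack entries.
                reliable = match["uncertain"] is None
                frames.append((match["symbol"], match["module"], reliable))
    return rip, frames


def stable_signature(rip, frames):
    """Build a signature that ignores obfuscated symbols and unreliable frames."""
    named = tuple(
        symbol
        for symbol, _module, reliable in frames
        if reliable and not OBFUSCATED_RE.match(symbol)
    )
    return (rip["offset"], rip["size"], rip["module"], named)
```

Applied to the two reports above, the RIP part of the signature is the same (+0x4/0x10 in nvidia_modeset) even though the faulting symbol is _nv000778kms in one release and _nv000899kms in the other.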
Configuration Options Tested (none helped)
- hardware.nvidia.open = true/false
- hardware.nvidia.powerManagement.enable = true/false
- hardware.nvidia.nvidiaPersistenced = true
- hardware.nvidia.forceFullCompositionPipeline = true
- boot.kernelParams = ["pcie_aspm=off" "pcie_port_pm=off"]
- services.hardware.bolt.enable = true
- Loading nvidia modules in initrd
- dGPU-only mode (no hybrid), with and without discrete graphics mode
- Various NVreg_* module parameters
dmesg Context (events leading to crash)
pcieport 0000:05:04.0: pciehp: Slot(4): Link Down
pcieport 0000:05:04.0: pciehp: Slot(4): Card not present
xhci_hcd 0000:2f:00.0: remove, state 1
usb usb6: USB disconnect, device number 1
[... USB teardown ...]
pci_bus 0000:2e: busn_res: [bus 2e-51] is released
BUG: kernel NULL pointer dereference, address: 0000000000000409
nvidia-bug-report.sh
nvidia-bug-report_before_freeze.log.gz (476.3 KB)
The attached log was captured before the freeze. Once the system freezes, sudo nvidia-bug-report.sh cannot be run, not even over SSH.
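Since nothing can be collected after the freeze, one option is to start streaming the kernel log to disk before pulling the cable and to sync every line as it arrives. This is a minimal sketch, assuming journalctl from systemd is available; the function names and the output path are hypothetical, and there is no guarantee that the final lines reach the disk before the system hangs.

```python
import os
import subprocess


def persist_lines(lines, path):
    """Append each line to path and force it to disk before reading the next."""
    count = 0
    with open(path, "a", encoding="utf-8") as out:
        for line in lines:
            out.write(line if line.endswith("\n") else line + "\n")
            out.flush()
            os.fsync(out.fileno())  # push the line through the page cache
            count += 1
    return count


def follow_kernel_log():
    """Yield kernel messages as journalctl prints them (blocks until stopped)."""
    proc = subprocess.Popen(
        ["journalctl", "--dmesg", "--follow", "--output=short-monotonic"],
        stdout=subprocess.PIPE,
        text=True,
    )
    try:
        yield from proc.stdout
    finally:
        proc.terminate()


# Example (hypothetical path), started before disconnecting the dock:
# persist_lines(follow_kernel_log(), "/var/tmp/kmsg-before-unplug.log")
```

Syncing after every line is slow, but the kernel log volume around a dock disconnect is small, so the trade-off favors durability here.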
Workaround
None known. Once the crash occurs, the only way to recover is a hard reboot.