Complete system freeze after S3 suspend/resume (RTX 5060)

System Information

  • GPU: NVIDIA GeForce RTX 5060 (GB206, rev a1) — PCI 01:00.0
  • Driver: 580.142 (NVIDIA UNIX Open Kernel Module)
  • VBIOS: 98.06.39.80.46
  • OS: Fedora 43 (KDE Plasma Desktop Edition)
  • Kernel: 6.19.10-200.fc43.x86_64
  • CPU: Intel Core i7-14700F
  • Display Server: KDE Plasma / KWin on Wayland
  • Driver packages: akmod-nvidia-580.142-1.fc43.x86_64 (RPM Fusion)

Description

After resuming from S3 suspend on a desktop system, the nvidia-drm kernel driver wedges, causing repeated pageflip timeouts and a complete system freeze. The display never recovers — no TTYs are accessible, and the system becomes entirely unresponsive. Processes using the GPU enter uninterruptible sleep (D state), surviving even SIGKILL. A SysRq REISUB is the only way to reboot.

This is a recurring issue, not a one-off. The freeze happens frequently after waking from suspend, with no apparent pattern as to which application is in use at the time. It does not always happen on the first resume — sometimes the system partially recovers from the initial pageflip timeouts only to freeze hard on a subsequent resume. All recovery attempts described below were performed via SSH from another machine, as the local console was completely unresponsive throughout.

KWin itself identifies this as an nvidia-drm bug and requests that it be reported here:

Pageflip timed out! This is a bug in the nvidia-drm kernel driver
Please report this at https://forums.developer.nvidia.com/c/gpu-graphics/linux

Timeline of Events (from journalctl -b -1)

First suspend/resume — Apr 6, 12:21 (partial recovery)

  1. System enters S3 sleep and resumes at 12:21:42.
  2. Immediately after resume, 21 consecutive “Pageflip timed out!” errors from kwin_wayland, one per second from 12:21:43 to 12:22:03.
  3. Kernel DRM errors:
    [drm:nv_drm_atomic_commit [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Flip event timeout on head 0
    
  4. After ~24 seconds, a late pageflip finally arrives:
    Pageflip arrived after all, 23898ms after the commit
    Atomic modeset commit failed! Permission denied
    Atomic modeset test failed! Permission denied
    Applying output configuration failed!
    
  5. The console falls back to the framebuffer device. The system partially recovers (services keep running), but the display stays unresponsive; the recovery is only observable via SSH.
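
For anyone who wants to check their own logs for the same pattern, here is a minimal sketch that counts the "Pageflip timed out!" messages and measures how long the burst lasted. It assumes the journal is exported with journalctl -b -1 -o json, which emits one JSON object per line with MESSAGE and __REALTIME_TIMESTAMP (microseconds since the epoch) fields. The function names are my own and purely illustrative.

```python
import json
from datetime import datetime, timezone

NEEDLE = "Pageflip timed out!"


def pageflip_timeouts(json_lines):
    """Return UTC timestamps of journal entries that mention a pageflip timeout."""
    stamps = []
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        message = entry.get("MESSAGE")
        # journald exports non-UTF-8 messages as a list of byte values; skip those.
        if not isinstance(message, str) or NEEDLE not in message:
            continue
        micros = int(entry["__REALTIME_TIMESTAMP"])
        stamps.append(datetime.fromtimestamp(micros / 1_000_000, tz=timezone.utc))
    return stamps


def summarize(stamps):
    """Return (number of timeouts, seconds between the first and the last one)."""
    if not stamps:
        return 0, 0.0
    return len(stamps), (max(stamps) - min(stamps)).total_seconds()
```

To use it, pipe the journalctl output into a small script that calls summarize(pageflip_timeouts(sys.stdin)). A burst like the one above would show up as 21 timeouts spanning roughly 20 seconds.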

Second suspend/resume — Apr 7, 15:37 (complete freeze, requires REISUB)

  1. System enters S3 and resumes at 15:37:55. Resume appears to complete normally at the kernel level, but the display never comes back — the screen remains frozen.
  2. Via SSH from a laptop, at 15:41:52, sudo rmmod nvidia_uvm nvidia_drm nvidia_modeset nvidia has no effect (modules in use / driver wedged).
  3. Via SSH, at 15:43:33, sudo systemctl restart sddm is run in an attempt to recover the display.
  4. SDDM terminates, but multiple KDE/Plasma processes become unkillable:
    15:47:07 — SIGTERM times out for: polkit-kde-auth, yakuake, baloorunner, kalendarac, xwaylandvideobridge, DiscoverNotifier, krunner
    15:47:13 — SIGABRT sent — no effect
    15:47:18 — SIGKILL sent — "Processes still around after SIGKILL. Ignoring."
    15:47:23 — Final SIGTERM/SIGABRT — no effect
    15:47:28 — Final SIGKILL — no effect
    15:47:33 — "Processes still around after final SIGKILL. Entering failed mode."
    
    These processes are stuck in uninterruptible sleep (D state) in the nvidia driver, likely blocked on a futex in libEGL_nvidia.so.0.
  5. At 15:52:18, plasmashell crashes.
  6. At 15:55:19, during the (failed) shutdown sequence:
    [drm] [nvidia-drm] [GPU ID 0x00000100] nv_drm_reset_input_colorspace failed with error code: -4 !
    
  7. At 15:55:21, the system is rebooted with the SysRq REISUB sequence, the only option left:
    kernel: sysrq: Emergency Sync
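
To confirm over SSH which processes are stuck in D state (as in step 4), a rough sketch like the following can scan /proc. It relies on the documented layout of /proc/<pid>/stat, where the third field is the process state and the command name sits in parentheses. The helper names are hypothetical, and this only reports the state; it does not prove the process is blocked inside the nvidia driver.

```python
import os


def parse_stat(text):
    """Split one /proc/<pid>/stat record into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' instead of splitting on whitespace.
    head, _, tail = text.rpartition(")")
    pid, _, comm = head.partition("(")
    return int(pid), comm, tail.split()[0]


def uninterruptible(proc_root="/proc"):
    """Return a sorted list of (pid, comm) for processes currently in state D."""
    stuck = []
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, name, "stat"), errors="replace") as fh:
                pid, comm, state = parse_stat(fh.read())
        except OSError:
            continue  # the process exited while we were scanning
        if state == "D":
            stuck.append((pid, comm))
    return stuck if not stuck else sorted(stuck)
```

Calling uninterruptible() on the affected machine should list the processes that systemd reported as still around after SIGKILL, provided they are still in D state at that moment.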
    

Additional observations

  • A WARNING is logged during nvidia-drm initialization at boot:
    WARNING: drivers/gpu/drm/drm_mode_config.c:578 at drm_mode_config_cleanup+0x336/0x350
    RIP: 0010:drm_mode_config_cleanup+0x336/0x350
    Call trace includes: nv_drm_register_drm_device.cold+0x83/0x10d [nvidia_drm]
    
  • Stack traces from the stuck EGL threads show them blocked in:
    pthread_cond_wait → libEGL_nvidia.so.0+0xc1ccc → libEGL_nvidia.so.0+0x8da29 → libEGL_nvidia.so.0+0xc79fe
    
  • The egl-wayland library version is 1.1.21 (libnvidia-egl-wayland.so.1), and crash traces show calls through wlEglSendDamageEvent and wlEglSwapBuffersWithDamageHook.

Steps to Reproduce

  1. Install Fedora 43 KDE with NVIDIA 580.142 drivers on an RTX 5060 (desktop system), running KDE Plasma on Wayland.
  2. Use the system normally. No specific application is required — the freeze is not correlated with any particular workload.
  3. Suspend to S3.
  4. Resume from suspend.
  5. The display freezes. The system may partially recover from the initial pageflip timeouts (as on Apr 6), or it may freeze completely and never recover (as on Apr 7). In either case, the display is unresponsive for an extended period.
  6. If the first resume succeeds, repeat steps 3 to 5. The freeze occurs frequently after suspend/resume, though not on every cycle.
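
When reproducing, it may be worth confirming that the machine really uses S3 rather than s2idle. A minimal sketch, assuming the standard /sys/power/mem_sleep interface, where the active mode is shown in square brackets and "deep" corresponds to suspend-to-RAM (S3) on ACPI systems. The function names are illustrative only.

```python
def active_mem_sleep(text):
    """Return the bracketed (currently selected) entry of /sys/power/mem_sleep."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def read_mem_sleep(path="/sys/power/mem_sleep"):
    """Read the sysfs file and return the selected suspend mode, e.g. 'deep'."""
    with open(path) as fh:
        return active_mem_sleep(fh.read())
```

If read_mem_sleep() returns "s2idle" instead of "deep", the suspend path being exercised differs from the one described in this report.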

Expected Behavior

The system should resume from S3 suspend without the nvidia-drm driver wedging. Pageflips should complete promptly, and the driver should not enter a state that causes processes to become unkillable.

Attachments

For what it’s worth, disabling my display’s “Deep Sleep Mode” feature (LG 32gs95ue) fixed my freezing on wake.