Random crashes on Wayland since driver 580 on Linux

General information:

Distribution: Arch Linux

Kernel: 6.16.7.arch1-1

Driver: 580.82.09-1, nvidia-dkms package (not the nvidia-open driver)

GPU: RTX 3070 Laptop (110 W)


Since driver 580, I’ve hit a regression: after about 1 hour of activity on any Wayland-based compositor (tested with Hyprland and Sway), my laptop freezes without logging anything, to the point that I have to force a power-off by holding the power button. Nothing relevant shows up in journalctl -b -1.
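
For anyone trying to reproduce this, here is a minimal sketch of how the previous boot's kernel log could be narrowed down to NVIDIA-related lines. The helper name filter_nvidia_lines is hypothetical; "NVRM" and "Xid" are the prefixes the NVIDIA kernel module uses in its kernel messages. Keep in mind that after a hard freeze the last messages may never have been written to disk, so an empty result does not prove that no error occurred.

```shell
# Filter log text on stdin for lines that mention the NVIDIA driver.
# filter_nvidia_lines is a hypothetical helper name used only for illustration.
filter_nvidia_lines() {
  # -i: case-insensitive, -E: extended regex; "|| true" keeps the exit
  # status at 0 when nothing matches (useful under "set -e").
  grep -iE 'NVRM|Xid|nvidia' || true
}
```

Usage, reading the kernel messages of the previous boot: journalctl -k -b -1 --no-pager | filter_nvidia_lines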

The 1-hour mark is an average; the freeze sometimes reproduces at the 20-minute mark or at the 2-hour mark, and it does not even need GPU load from 3D games. Normal web browsing and routine tasks can trigger it. Disabling nvidia-powerd.service makes the system more stable, so it no longer crashes within the first 10 minutes, but it still crashes eventually.

Important: it does not reproduce on X11, even if I deliberately put load on the system and stay logged in for hours. Here are some of the pictures I’ve got.

Hyprland, last night after 1h45 of use:

Sway, last week after 2h of use (same artifacts, but since there was some personal data on the left I had to blur it before posting here).

Wayland periodically crashes entirely since I installed 580.82.09: every application quits and I need to restart the whole session. It's odd.
This is the error:

kwin_wayland[1683]: kwin_scene_opengl: Invalid framebuffer status: "GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT"
systemd-coredump[14730]: [🡕] Process 1683 (kwin_wayland) of user 1000 dumped core.

This is with KDE Plasma.

General information:

Distribution: Ubuntu 24.04.3 (Wayland)

Kernel: Linux 6.11.0-29-generic

Driver: 580.65.06, nvidia-dkms package (not the nvidia-open driver)

GPU: RTX 3050 Laptop

Power profile: NVIDIA On-Demand

Using the system for anything other than games: no crash

Launching any game: the game crashes after 10 to 20 minutes

Relaunching a game that has already crashed: no crash

This happens only with driver version 580. I tried completely wiping the driver and all related config files and then reinstalling the driver, but the same issue persists.

And adding more context here:

  • It doesn’t matter whether I use nvidia or nvidia-open here, both crash
  • This does not reproduce with nouveau + zink

So the community is literally making better drivers than the company that is supposed to fix these issues.

$ memtest_vulkan
https://github.com/GpuZelenograd/memtest_vulkan v0.5.0 by GpuZelenograd
To finish testing use Ctrl+C

1: Bus=0x01:00 DevId=0x249D   8GB NVIDIA GeForce RTX 3070 Laptop GPU (NVK GA104)
Standard 5-minute test of 1: Bus=0x01:00 DevId=0x249D   8GB NVIDIA GeForce RTX 3070 Laptop GPU (NVK GA104)
      1 iteration. Passed  0.0761 seconds  written:    1.0GB  39.1GB/sec        checked:    2.0GB  39.6GB/sec
     68 iteration. Passed  1.0099 seconds  written:   67.0GB 187.4GB/sec        checked:  134.0GB 205.4GB/sec
    435 iteration. Passed  5.0012 seconds  written:  367.0GB 214.2GB/sec        checked:  734.0GB 223.2GB/sec
   2617 iteration. Passed 30.0129 seconds  written: 2182.0GB 212.1GB/sec        checked: 4364.0GB 221.3GB/sec
   4782 iteration. Passed 30.0015 seconds  written: 2165.0GB 210.8GB/sec        checked: 4330.0GB 219.5GB/sec
   6936 iteration. Passed 30.0069 seconds  written: 2154.0GB 209.6GB/sec        checked: 4308.0GB 218.3GB/sec
   9083 iteration. Passed 30.0128 seconds  written: 2147.0GB 208.9GB/sec        checked: 4294.0GB 217.6GB/sec
  11224 iteration. Passed 30.0088 seconds  written: 2141.0GB 209.0GB/sec        checked: 4282.0GB 216.7GB/sec
  13359 iteration. Passed 30.0042 seconds  written: 2135.0GB 208.7GB/sec        checked: 4270.0GB 216.0GB/sec
  15490 iteration. Passed 30.0100 seconds  written: 2131.0GB 208.4GB/sec        checked: 4262.0GB 215.4GB/sec
  17617 iteration. Passed 30.0089 seconds  written: 2127.0GB 208.3GB/sec        checked: 4254.0GB 214.9GB/sec
  19740 iteration. Passed 30.0093 seconds  written: 2123.0GB 207.9GB/sec        checked: 4246.0GB 214.5GB/sec
Standard 5-minute test PASSed! Just press Ctrl+C unless you plan long test run.
Extended endless test started; testing more than 2 hours is usually unneeded
use Ctrl+C to stop it when you decide it's enough

memtest_vulkan indicates that there are no memory issues with my GPU either, and no crash has been triggered so far since I started using nouveau on my laptop.
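
For longer runs, a saved copy of the output can be summarized instead of read line by line. This is only a rough sketch: it assumes the per-iteration line format shown above (the word "iteration." followed by "Passed" on success), and summarize_memtest_log is a hypothetical helper name, not part of memtest_vulkan.

```shell
# Count per-iteration lines in memtest_vulkan output read from stdin.
# Assumption: iteration lines contain "iteration." and passing ones contain "Passed".
summarize_memtest_log() {
  awk '
    /iteration\./ { total++; if ($0 ~ /Passed/) passed++ }
    END { printf "%d/%d iteration lines passed\n", passed, total }
  '
}
```

Usage, with the output saved to a file of your choosing: summarize_memtest_log < memtest.log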