Trouble suspending with 510.39.01, Linux 5.16.0: Freezing of tasks failed after 20.009 seconds

Unrelated to the suspend problem: you have “nomodeset” in your kernel cmdline. This previously disabled nvidia-drm.modeset=1, so I’m astonished this is working at all. Better to remove it so you don’t run into trouble later.
I suspect the vt switch in nvidia-sleep.sh is triggering the bug, so

  • does switching to vt and back (several times) trigger this also?
  • does disabling nvidia-suspend and nvidia-resume in systemd prevent this?
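
For anyone wanting to try the two checks above, here is a rough sketch. It assumes chvt and fgconsole from the kbd package are installed and that VT 3 is not in use; vt_switch_test is a made-up helper name, not something shipped with the driver. By default it only prints the switches it would make; set RUN=1 and run it as root to actually perform them.

```shell
# Hypothetical helper: switch to a spare VT and back, several times.
# Arguments: <home VT> <spare VT> [rounds, default 3]
vt_switch_test() {
    home_vt="$1"; spare_vt="$2"; rounds="${3:-3}"
    i=0
    while [ "$i" -lt "$rounds" ]; do
        for vt in "$spare_vt" "$home_vt"; do
            if [ "${RUN:-0}" = 1 ]; then
                # Really switch (needs root), then give the driver a moment.
                chvt "$vt" && sleep 2
            else
                # Dry run: only show what would be done.
                echo "chvt $vt"
            fi
        done
        i=$((i + 1))
    done
}

# Example, as root from the VT that hosts the graphical session:
#   RUN=1 vt_switch_test "$(fgconsole)" 3
# Second check: take the NVIDIA hooks out of the suspend path, then suspend again:
#   systemctl disable nvidia-suspend.service nvidia-resume.service
```

If the hang shows up during the VT switches alone, that points at the switch rather than at the suspend path itself.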

I removed nomodeset and video= from the cmdline. It’s been over a week with no issues and no display corruption. Who knew :)

I’m seeing a similar suspend failure on Fedora 35, kernel 5.16.3, nvidia 510.39.01, using GNOME on Wayland, except that I don’t have nomodeset and video= in my cmdline. How were you able to get around the suspend issue?

Disabling nvidia-suspend, nvidia-resume, and removing NVreg_PreserveVideoMemoryAllocations=1 seems to fix suspend but leads to artifacts on wake.
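
To confirm which of these settings the loaded driver actually picked up, something like the following may help. It is only a sketch: it assumes the driver lists its parameters in /proc/driver/nvidia/params as “Name: value” lines (without the NVreg_ prefix), and nvreg_param is a hypothetical helper name.

```shell
# Hypothetical helper: print the value of one NVIDIA module parameter.
# Arguments: <parameter name> [params file, default /proc/driver/nvidia/params]
nvreg_param() {
    awk -F': ' -v key="$1" '$1 == key { print $2 }' "${2:-/proc/driver/nvidia/params}"
}

# Example:
#   nvreg_param PreserveVideoMemoryAllocations
#   systemctl is-enabled nvidia-suspend.service nvidia-resume.service
```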

[  102.797810] PM: suspend entry (s2idle)
[  102.807629] Filesystems sync: 0.009 seconds
[  102.807804] Freezing user space processes ... 
[  122.808924] Freezing of tasks failed after 20.001 seconds (1 tasks refusing to freeze, wq_busy=0):
[  122.808951] task:gnome-shell     state:D stack:    0 pid: 2932 ppid:  2623 flags:0x00000004
[  122.808959] Call Trace:
[  122.808961]  <TASK>
[  122.808967]  __schedule+0x2d6/0x10b0
[  122.808981]  schedule+0x4e/0xc0
[  122.808986]  rwsem_down_read_slowpath+0x310/0x350
[  122.808993]  nvkms_ioctl_from_kapi+0x27/0x90 [nvidia_modeset]
[  122.809036]  _nv000092kms+0x42/0x50 [nvidia_modeset]
[  122.809090]  ? nv_drm_framebuffer_destroy+0x3b/0x50 [nvidia_drm]
[  122.809099]  ? drm_mode_rmfb+0x188/0x1c0 [drm]
[  122.809149]  ? drm_mode_rmfb+0x1c0/0x1c0 [drm]
[  122.809196]  ? drm_ioctl_kernel+0x8c/0x120 [drm]
[  122.809237]  ? drm_ioctl+0x220/0x3e0 [drm]
[  122.809277]  ? drm_mode_rmfb+0x1c0/0x1c0 [drm]
[  122.809324]  ? do_unlinkat+0x13f/0x2b0
[  122.809332]  ? security_file_ioctl+0x32/0x50
[  122.809337]  ? __x64_sys_ioctl+0x82/0xb0
[  122.809341]  ? do_syscall_64+0x3b/0x90
[  122.809346]  ? entry_SYSCALL_64_after_hwframe+0x44/0xae
[  122.809352]  </TASK>
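
When comparing reports like the one above, it can help to pull out just the task lines. A minimal sketch, assuming the “task:NAME state:X ... pid:N” layout seen in this log; frozen_task_report is a made-up name, and it will also match task dumps from other kernel reports that use the same layout.

```shell
# Hypothetical helper: read kernel log text on stdin and print one line
# per dumped task, with its name, PID and state letter.
frozen_task_report() {
    sed -n 's/.*task:\([^ ]*\) *state:\([A-Z]\).* pid: *\([0-9]*\) .*/\1 pid=\3 state=\2/p'
}

# Example:
#   sudo dmesg | frozen_task_report
#   journalctl -k -b | frozen_task_report
```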

I spoke too soon; I still see these in dmesg, though suspend seems to work when I ask for it explicitly:

[87732.534435] Freezing user space processes ... 
[87752.537224] Freezing of tasks failed after 20.002 seconds (1 tasks refusing to freeze, wq_busy=0):
[87752.537234] task:gnome-shell     state:D stack:    0 pid:2607064 ppid:2607002 flags:0x00000004
[87752.537238] Call Trace:
[87752.537239]  <TASK>
[87752.537241]  __schedule+0x265/0x700
[87752.537248]  ? find_busiest_group+0xeb/0xa60
[87752.537252]  schedule+0x49/0xd0
[87752.537254]  rwsem_down_read_slowpath+0x315/0x360
[87752.537258]  ? __kmalloc+0x1a4/0x2d0
[87752.537261]  nvkms_ioctl_from_kapi+0x22/0x90 [nvidia_modeset]
[87752.537275]  _nv002056kms+0x126c/0x2710 [nvidia_modeset]
[87752.537291]  ? nv_drm_internal_framebuffer_create+0x24d/0x8b0 [nvidia_drm]
[87752.537295]  ? nv_drm_exit+0x310/0x370 [nvidia_drm]
[87752.537298]  ? drm_internal_framebuffer_create+0x3a8/0x4e0
[87752.537301]  ? drm_mode_addfb2+0x2c/0xb0
[87752.537303]  ? drm_mode_addfb_ioctl+0x10/0x10
[87752.537305]  ? drm_ioctl_kernel+0xb1/0x140
[87752.537307]  ? rm_ioctl+0x63/0xb0 [nvidia]
[87752.537484]  ? drm_ioctl+0x225/0x410
[87752.537486]  ? drm_mode_addfb_ioctl+0x10/0x10
[87752.537488]  ? __x64_sys_futex+0x6e/0x1d0
[87752.537491]  ? __x64_sys_ioctl+0x8d/0xb0
[87752.537494]  ? do_syscall_64+0x38/0xc0
[87752.537496]  ? entry_SYSCALL_64_after_hwframe+0x44/0xae
[87752.537499]  </TASK>

I would disable preserving video memory allocations but then my screen is unusable on wake.

Interestingly for me, it doesn’t actually suspend. It goes to s2idle (no video signal) and wakes back up after 20 seconds.

What if you run systemctl isolate multi-user.target and then systemctl suspend, assuming you’re on systemd?

Same thing - it just tries to suspend for 20 seconds and then it wakes back up to the gnome login screen.

NVreg_EnableS0ixPowerManagement=1 works for me with Wayland session.
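
A sketch of how one might check the suspend mode before trying this option. As far as I understand, the S0ix path is only relevant when the system suspends via s2idle, so treat that as an assumption. The kernel marks the active mode with brackets in /sys/power/mem_sleep; active_mem_sleep is a hypothetical helper name.

```shell
# Hypothetical helper: print the active suspend mode, i.e. the bracketed
# entry in /sys/power/mem_sleep (for example "s2idle" or "deep").
active_mem_sleep() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "${1:-/sys/power/mem_sleep}"
}

# Assumed way to set the option (the file name nvidia-power.conf is arbitrary):
#   echo 'options nvidia NVreg_EnableS0ixPowerManagement=1' | sudo tee /etc/modprobe.d/nvidia-power.conf
```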

Had the same issue on Fedora 35 after upgrade to nvidia 510.47.03 and kernel 5.16.5.

This fixed it for me, working suspend/resume without graphics corruption:

  1. Uninstall the package “xorg-x11-drv-nvidia-power”.
  2. Reboot.
  3. Select GNOME as session during logon, not “GNOME on Wayland”.
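
The steps above, sketched as shell for Fedora. The two function names are made up for illustration; XDG_SESSION_TYPE is assumed to be set by the login session, which is the usual case under systemd-logind.

```shell
# Step 1: remove the package named above (Fedora, dnf). Not run automatically.
remove_nvidia_power_pkg() {
    sudo dnf remove xorg-x11-drv-nvidia-power
}

# After rebooting and logging in: succeeds only if the session is X11,
# i.e. "GNOME" rather than "GNOME on Wayland" was selected.
session_is_x11() {
    [ "${XDG_SESSION_TYPE:-}" = x11 ]
}

# Example:
#   session_is_x11 && echo "running on Xorg" || echo "not an X11 session"
```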

I have this problem with 510.60.02 and Linux 5.17.3

I found a solution.

gnome-shell is trying to talk to the NVIDIA driver after it has already gone into suspend, so it can’t respond. Linux tries to freeze the task, but fails because gnome-shell is waiting for a response from the driver and can’t be frozen.

The solution is to manually suspend gnome-shell using the STOP signal before the NVIDIA driver goes to suspend. Then use the CONT signal on resume.

/usr/local/bin/suspend-gnome-shell.sh:

#!/bin/bash

case "$1" in
    suspend)
        killall -STOP gnome-shell
        ;;
    resume)
        killall -CONT gnome-shell
        ;;
esac

/etc/systemd/system/gnome-shell-suspend.service:

[Unit]
Description=Suspend gnome-shell
Before=systemd-suspend.service
Before=systemd-hibernate.service
Before=nvidia-suspend.service
Before=nvidia-hibernate.service

[Service]
Type=oneshot
ExecStart=/usr/local/bin/suspend-gnome-shell.sh suspend

[Install]
WantedBy=systemd-suspend.service
WantedBy=systemd-hibernate.service

/etc/systemd/system/gnome-shell-resume.service:

[Unit]
Description=Resume gnome-shell
After=systemd-suspend.service
After=systemd-hibernate.service
After=nvidia-resume.service

[Service]
Type=oneshot
ExecStart=/usr/local/bin/suspend-gnome-shell.sh resume

[Install]
WantedBy=systemd-suspend.service
WantedBy=systemd-hibernate.service

Then just enable the two new systemd units:

systemctl daemon-reload
systemctl enable gnome-shell-suspend
systemctl enable gnome-shell-resume

This should interrupt gnome-shell in time so it’s not trying to access the graphics hardware. It worked for me.
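
To check that the STOP/CONT part behaves as described before relying on it, here is a small sketch. It reads the process state from /proc (T means stopped) and assumes pgrep from procps is available; the helper names are hypothetical. Run it from an SSH session or a text console, since the desktop stops responding while gnome-shell is stopped.

```shell
# Hypothetical helper: print the state letter of a process
# (T = stopped, S = sleeping, D = uninterruptible sleep).
proc_state() {
    awk '/^State:/ { print $2 }' "/proc/$1/status"
}

# Report the state of every gnome-shell process.
gnome_shell_states() {
    for pid in $(pgrep -x gnome-shell); do
        echo "gnome-shell pid=$pid state=$(proc_state "$pid")"
    done
}

# Example:
#   sudo /usr/local/bin/suspend-gnome-shell.sh suspend; gnome_shell_states   # expect state=T
#   sudo /usr/local/bin/suspend-gnome-shell.sh resume;  gnome_shell_states
#   systemctl is-enabled gnome-shell-suspend.service gnome-shell-resume.service
```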


I tested your solution and it worked perfectly. I think we just need to test it with other suspend-related options enabled, like “NVreg_PreserveVideoMemoryAllocations” and “NVreg_EnableS0ixPowerManagement”, to check for conflicts. Maybe your solution could be implemented by distros, or GNOME could tweak its code to avoid the problem you found.

This also seems to solve it on my end for Fedora 36 and a 3080! For anyone trying this, just make sure to run sudo chmod +x on the script.

Unfortunately, this solution breaks “Resume from Hibernation” on systems that have no support for S0ix. I have a new Alder Lake H670 motherboard that does not support S0ix and has no option to enable it in the BIOS, as tested with Intel’s S0ix support testing script.

This workaround does make Resume from Suspend work, which was nice, but non-working hibernation is a show-stopper for me. The system would try to wake up from hibernation, fail with this error, and then start a fresh session:

Jul 03 19:16:06 nahuatl kernel: PM: hibernation: Failed to load image, recovering.
Jul 03 19:16:06 nahuatl kernel: nvidia 0000:01:00.0: PM: failed to quiesce async: error -5
Jul 03 19:16:06 nahuatl kernel: PM: dpm_run_callback(): pci_pm_freeze+0x0/0xd0 returns -5
Jul 03 19:16:06 nahuatl kernel: PM: pci_pm_freeze(): nv_pmops_freeze+0x0/0x20 [nvidia] returns -5
Jul 03 19:16:06 nahuatl kernel: NVRM: GPU 0000:01:00.0: PreserveVideoMemoryAllocations module parameter is set. System Power Management attempted without driver procfs suspend interface. Please refer to the ‘Configuring Power Management Support’

I tested with drivers 515, 510 and 470, on Ubuntu 22.04 with Wayland.
The error about “PreserveVideoMemoryAllocations module parameter is set” is odd: it only happens when resuming from hibernate, not when resuming from suspend. The nvidia power management services were all loaded and enabled with all three driver versions I tested.
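
Since the NVRM message complains about the “driver procfs suspend interface”, a quick check of that interface and of the power management units might look like the sketch below. It assumes the interface lives at /proc/driver/nvidia/suspend (the file the nvidia-sleep.sh script is understood to write to); both function names are hypothetical.

```shell
# Hypothetical helper: succeed if the driver's procfs suspend interface exists.
has_suspend_iface() {
    [ -e "${1:-/proc/driver/nvidia/suspend}" ]
}

# Print whether each NVIDIA power management unit is enabled.
nvidia_pm_units() {
    for unit in nvidia-suspend nvidia-hibernate nvidia-resume; do
        echo "$unit.service: $(systemctl is-enabled "$unit.service" 2>/dev/null)"
    done
}

# Example:
#   has_suspend_iface && echo "procfs suspend interface present"
#   nvidia_pm_units
```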

I had to revert to Xorg, where both suspend-to-RAM and hibernation work just fine. It’s disappointing considering that all development focus has been on Wayland for years. Nvidia team, please step up and sort this out without requiring S0ix, which is not that commonly supported.

Interesting, and it solves my “suspend fails after 20 seconds” problem with a slightly elderly but still useful NVIDIA GeForce GTX 750 Ti card and the 515.57 driver under Wayland on Fedora 36.
I’m not qualified to judge whether this is a fix or a workaround, as I don’t know enough about systemd, but I’m certainly pleased with the result and it should be known more widely. I hope my commenting helps draw attention to the original post.
Also, I’d be happy to help with further testing…

Thank you Devyn,

Neil

For me this fix works with “options nvidia NVreg_PreserveVideoMemoryAllocations=1”

I’m very happy to do further testing if people have suggestions. (I’d even be happy to try to reactivate hibernate and try it, at least on an experimental basis. For my use case, suspend with the screens off provides the power saving that I want.)

Neil

BTW, what font is this - the letter forms are crisp and clear, but the brackets and braces ( [ { are all a bit close to square brace to my tired old eyes!

NVidia GTX 750 Ti, driver 515.57, fedora 36, kernel 5.18.11-lqx1.0.fc36.x86_64

For me this fix works with “options nvidia NVreg_PreserveVideoMemoryAllocations=1 NVreg_TemporaryFilePath=/var/tmp”,
also with nvidia-{suspend,resume}.service enabled.

OS: Ubuntu 22.04 LTS x86_64
Host: 20URS01L00 ThinkPad T15g Gen 1
Kernel: 5.17.0-1013-oem
GPU: NVIDIA GeForce RTX 2070 SUPER Mobile / Max-Q (dedicated graphics only)

As I didn’t see a pre-existing ticket, just to make sure the GNOME developers are aware of this I logged gnome-shell#5772 referencing this thread.

I haven’t tried the workaround but I’m experiencing the same issue on:

  • Fedora 36 Workstation
  • GNOME 42
  • Kernel 5.18.17
  • GTX 1070 :: 515.65.01

This workaround also worked for me, thank you. Only tested with suspend (deep), I’ve not tried hibernate yet but may do soon. NVreg_PreserveVideoMemoryAllocations=1 is set for me.

I just tested it, and it seems to work for me when using suspend (I didn’t test hibernate). Thank you mate!

Fedora 37 Workstation
GNOME 43.1
Kernel: 6.0.9-602.inttf.fc37.x86_64
GTX 1070: 520.56.06