Systemd's suspend-then-hibernate not working on an NVIDIA Optimus laptop

My system:

  • EndeavourOS Linux x86_64
  • XPS 15 9500
  • 5.17.5-arch1-1 kernel
  • NVIDIA GeForce GTX 1650 Ti Mobile with nvidia-dkms 510.68.02-1 driver

Description of the problem:
I have an Optimus laptop with an NVIDIA Turing card, and I have already configured RTD3 power management following the Arch Wiki. I’ve also configured the system to preserve video memory after suspend and enabled the nvidia-suspend, nvidia-resume, nvidia-hibernate and nvidia-persistenced services.
When I put the laptop into suspend or hibernate with systemctl suspend or systemctl hibernate, everything works as expected. However, if I use systemctl suspend-then-hibernate, the system fails to enter the suspend state with the following error message:

mag 07 15:09:00 ikigai systemd[1]: Starting Suspend; Hibernate if not used for a period of time...
mag 07 15:09:00 ikigai systemd-sleep[292811]: Entering sleep state 'suspend'...
mag 07 15:09:07 ikigai systemd-sleep[292811]: Failed to put system to sleep. System resumed again: Input/output error
mag 07 15:09:09 ikigai systemd[1]: systemd-suspend-then-hibernate.service: Main process exited, code=exited, status=1/FAILURE
mag 07 15:09:09 ikigai systemd[1]: systemd-suspend-then-hibernate.service: Failed with result 'exit-code'.
mag 07 15:09:09 ikigai systemd[1]: Failed to start Suspend; Hibernate if not used for a period of time.

My guess is that this happens because there is no nvidia-suspend-then-hibernate service. Specifying NVreg_PreserveVideoMemoryAllocations=0 works, but only if the NVIDIA card is not in use: if it is, the system sometimes goes to sleep and sometimes does not, which causes unexpected overheating (e.g. when I close the lid and put the laptop in my bag).
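
In case it helps anyone comparing setups: the value the loaded driver is actually using can be read back from /proc/driver/nvidia/params. This is only a minimal sketch, assuming that file lists one "Name: value" pair per line; nvidia_param is a helper name made up for this example.

```shell
#!/bin/sh
# Print the value of one NVIDIA driver parameter as reported by the loaded module.
# Assumption: the params file contains lines such as
#   PreserveVideoMemoryAllocations: 1
# nvidia_param is a hypothetical helper, not something shipped by NVIDIA.
nvidia_param() {
    name=$1
    # Second argument lets you point at a different file (handy for testing).
    params_file=${2:-/proc/driver/nvidia/params}
    # Split on ": " and print the value of the exactly matching parameter name.
    awk -F': *' -v name="$name" '$1 == name { print $2 }' "$params_file"
}
```

Calling nvidia_param PreserveVideoMemoryAllocations after a reboot should show whether the module option was really applied.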

Unfortunately my laptop does not support S3 (“deep standby”) but only s2idle (“modern standby”), so the system still consumes a lot of power while suspended (30% of the battery in 10 hours). For this reason I need the suspend-then-hibernate feature.
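
For anyone wanting to check the same thing on their own machine: the kernel lists the supported suspend modes in /sys/power/mem_sleep and puts the selected one in square brackets. A small sketch follows; the two helper names are made up for illustration.

```shell
#!/bin/sh
# Print the selected suspend mode from a mem_sleep-style string.
# The kernel marks the active mode with square brackets, e.g. "[s2idle] deep".
active_mem_sleep() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Succeed if the given mode is listed at all, whether selected or not.
supports_mem_sleep() {
    case " $(printf '%s' "$1" | tr -d '[]') " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Typical use would be active_mem_sleep "$(cat /sys/power/mem_sleep)". If "deep" is not listed at all, the firmware does not offer S3 and s2idle is the only option.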

Am I missing something? Does someone else have this problem?

I think I’ve found a workaround: I disabled nvidia-suspend, nvidia-hibernate and nvidia-resume and edited /lib/systemd/system-sleep/nvidia as follows:

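#!/bin/sh

# $1 is "pre" (about to sleep) or "post" (just resumed).
case "$1" in
    pre)
        # SYSTEMD_SLEEP_ACTION names the step actually being performed.
        case "$SYSTEMD_SLEEP_ACTION" in
            suspend|hibernate)
                /usr/bin/nvidia-sleep.sh "$SYSTEMD_SLEEP_ACTION"
                ;;
            suspend-after-failed-hibernate)
                /usr/bin/nvidia-sleep.sh "suspend"
                ;;
        esac
        ;;
    post)
        # Wake the driver up again, whatever the sleep action was.
        /usr/bin/nvidia-sleep.sh "resume"
        ;;
esac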

Following systemd’s documentation, I found out that all the executables in /lib/systemd/system-sleep/ are run before entering sleep and again after resuming, with the first argument set to “pre” or “post” respectively. So when it’s “post” I simply call nvidia-sleep.sh with the “resume” argument, and when it’s “pre” I call it with the value of the SYSTEMD_SLEEP_ACTION variable.

It feels like a hack, but it works. Should I consider this solved?

It’s the only thing I found that works. It’s a hack in the sense that it requires re-editing when updating the NVIDIA drivers, I guess. Thanks for the solution. Long live the hack :D (Seriously, I hope it won’t break with upcoming drivers).

@amrits Sorry to bug you, this issue is still present with driver 515.43.04.
Sounds like this would be a simple fix, if this “hack” is considered good enough.

I had to make one tweak to your code, and now suspend-then-hibernate is 100% working: I had to put sleep 4 after /usr/bin/nvidia-sleep.sh "resume", otherwise my laptop would hang after resuming from hibernation.

I suspect this is because the echo "resume" > /proc/driver/nvidia/suspend and echo "hibernate" > /proc/driver/nvidia/suspend calls (via /usr/bin/nvidia-sleep.sh) occur too rapidly for the nvidia driver to handle properly.
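
Putting the original workaround and this tweak together, the hook could look roughly like the sketch below. The NVIDIA_SLEEP and RESUME_DELAY variables are not part of the NVIDIA package; they are hypothetical overrides added here so the script can be dry-run without touching the driver, and the 4-second default is simply the value that worked on this one laptop.

```shell
#!/bin/sh
# Sketch of /lib/systemd/system-sleep/nvidia with a pause after resume.
# NVIDIA_SLEEP and RESUME_DELAY are hypothetical overrides for illustration only.
NVIDIA_SLEEP="${NVIDIA_SLEEP:-/usr/bin/nvidia-sleep.sh}"
RESUME_DELAY="${RESUME_DELAY:-4}"

nvidia_sleep_hook() {
    case "${1:-}" in
        pre)
            case "${SYSTEMD_SLEEP_ACTION:-}" in
                suspend|hibernate)
                    "$NVIDIA_SLEEP" "$SYSTEMD_SLEEP_ACTION"
                    ;;
                suspend-after-failed-hibernate)
                    "$NVIDIA_SLEEP" "suspend"
                    ;;
            esac
            ;;
        post)
            "$NVIDIA_SLEEP" "resume"
            # Pause so the following "pre hibernate" call does not reach the
            # driver right away (a suspected cause, not a confirmed one).
            sleep "$RESUME_DELAY"
            ;;
    esac
}

# systemd-sleep passes "pre" or "post" as the first argument.
nvidia_sleep_hook "$@"
```

For a dry run on a copy of the file, something like NVIDIA_SLEEP=echo RESUME_DELAY=0 SYSTEMD_SLEEP_ACTION=hibernate sh ./nvidia pre should only print the argument that would have been passed to nvidia-sleep.sh.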

To debug, I put this at the top of /lib/systemd/system-sleep/nvidia:

echo "$(date) $1 $2 $SYSTEMD_SLEEP_ACTION" >> /var/log/my-sleep-log

Before adding the sleep, I saw:

Sun Jun 12 08:58:41 PM CDT 2022  pre  suspend-then-hibernate  suspend
Sun Jun 12 08:58:53 PM CDT 2022  post  suspend-then-hibernate  suspend
Sun Jun 12 08:58:54 PM CDT 2022  pre  suspend-then-hibernate  hibernate

After the sleep I see:

Sun Jun 12 09:28:19 PM CDT 2022  pre  suspend-then-hibernate  suspend
Sun Jun 12 09:28:31 PM CDT 2022  post  suspend-then-hibernate  suspend
Sun Jun 12 09:28:36 PM CDT 2022  pre  suspend-then-hibernate  hibernate
Sun Jun 12 09:29:18 PM CDT 2022  post  suspend-then-hibernate  hibernate

So it looks like, before my tweak, the nvidia driver wouldn’t properly prepare for hibernation, causing the system to crash/hang immediately after the first frame of video was painted to the screen, leaving a frozen desktop.

While it would be nice if nvidia would write proper drivers, I would be happy if they at least rolled out this change to /lib/systemd/system-sleep/nvidia so suspend-then-hibernate isn’t broken by default (and so I don’t have to keep manually altering this file after package updates). If support can forward this to the dev team, it would be much appreciated.
