Systemd's suspend-then-hibernate not working in NVIDIA Optimus laptop

My system:

  • EndeavourOS Linux x86_64
  • XPS 15 9500
  • 5.17.5-arch1-1 kernel
  • NVIDIA GeForce GTX 1650 Ti Mobile with nvidia-dkms 510.68.02-1 driver

Description of the problem:
I have an Optimus laptop with an NVIDIA Turing card and I have already configured RTD3 power management following the Arch Wiki. I’ve also configured the system to preserve video memory after suspend and enabled the nvidia-suspend, nvidia-resume, nvidia-hibernate and nvidia-persistenced services.
When I put the laptop into either the suspend or the hibernate state, using either systemctl suspend or systemctl hibernate, everything works as expected. However, if I use systemctl suspend-then-hibernate, the system fails to enter the suspend state with the following error message:

mag 07 15:09:00 ikigai systemd[1]: Starting Suspend; Hibernate if not used for a period of time...
mag 07 15:09:00 ikigai systemd-sleep[292811]: Entering sleep state 'suspend'...
mag 07 15:09:07 ikigai systemd-sleep[292811]: Failed to put system to sleep. System resumed again: Input/output error
mag 07 15:09:09 ikigai systemd[1]: systemd-suspend-then-hibernate.service: Main process exited, code=exited, status=1/FAILURE
mag 07 15:09:09 ikigai systemd[1]: systemd-suspend-then-hibernate.service: Failed with result 'exit-code'.
mag 07 15:09:09 ikigai systemd[1]: Failed to start Suspend; Hibernate if not used for a period of time.

My guess is that the cause is the lack of an nvidia-suspend-then-hibernate service. Specifying NVreg_PreserveVideoMemoryAllocations=0 works, but only if the NVIDIA card is not in use: if it is, the system sometimes goes to sleep and sometimes does not, causing unexpected overheating (e.g. when I close the lid and put the laptop in my bag).
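
The value the driver is actually running with can be read back from /proc/driver/nvidia/params. A minimal sketch, assuming the file lists the option as "PreserveVideoMemoryAllocations: <value>"; the function name preserve_vram_enabled is made up for illustration:

```shell
#!/bin/sh
# Return success if the loaded driver reports that video memory
# preservation is enabled. The optional first argument overrides the
# params file (assumption: one "Name: value" pair per line).
preserve_vram_enabled() {
    grep -q '^PreserveVideoMemoryAllocations: 1$' \
        "${1:-/proc/driver/nvidia/params}" 2>/dev/null
}

if preserve_vram_enabled; then
    echo "video memory is preserved across sleep"
else
    echo "video memory is not preserved (or the driver is not loaded)"
fi
```

After changing the module option, the driver has to be reloaded (or the machine rebooted) before the new value shows up here.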

Unfortunately my laptop does not support s3 (“deep standby”) but only s2idle (“modern standby”): thus when suspended the system still consumes a lot of power (30% in 10 hours) and for this reason I need to use the suspend-then-hibernate feature.
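
For anyone who wants to check their own machine: the kernel lists the supported suspend variants in /sys/power/mem_sleep and marks the active one with square brackets. A small sketch that extracts the active entry; the function name active_mem_sleep is hypothetical:

```shell
#!/bin/sh
# Print the bracketed (active) entry of a mem_sleep-style file,
# e.g. "[s2idle] deep" -> "s2idle".
active_mem_sleep() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "${1:-/sys/power/mem_sleep}"
}

if [ -r /sys/power/mem_sleep ]; then
    echo "supported: $(cat /sys/power/mem_sleep)"
    echo "active:    $(active_mem_sleep)"
fi
```

If "deep" does not appear in the list at all, the firmware does not offer S3 and only s2idle is available, which is the situation described above.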

Am I missing something? Does someone else have this problem?

I think I’ve found a workaround: I disabled nvidia-suspend, nvidia-hibernate and nvidia-resume and I’ve edited the /lib/systemd/system-sleep/nvidia as follows:

#!/bin/sh

case "$1" in
    pre)
        case "$SYSTEMD_SLEEP_ACTION" in
            suspend|hibernate)
                /usr/bin/nvidia-sleep.sh "$SYSTEMD_SLEEP_ACTION"
                ;;
            suspend-after-failed-hibernate)
                /usr/bin/nvidia-sleep.sh "suspend"
                ;;
        esac
        ;;
    post)
        /usr/bin/nvidia-sleep.sh "resume"
        ;;
esac

Following systemd’s documentation, I’ve found out that all the executables in /lib/systemd/system-sleep/ are run both before entering sleep mode and after leaving it, and the first argument passed is either “pre” or “post”. Therefore, when it’s “post” I simply call nvidia-sleep.sh with the “resume” argument, while when it’s “pre” I call it with the value of the SYSTEMD_SLEEP_ACTION variable.

It feels like a hack, but it works. Should I consider this solved?

It’s the only thing I found that works. It’s a hack in the sense that it requires re-editing when updating the NVIDIA drivers, I guess. Thanks for the solution. Long live the hack :D (Seriously, I hope it won’t break with upcoming drivers).

@amrits Sorry to bug you, this issue is still present with driver 515.43.04.
Sounds like this would be a simple fix, if this “hack” is considered good enough.

I had to make one tweak to your code, and now suspend-then-hibernate is 100% working: I had to put sleep 4 after /usr/bin/nvidia-sleep.sh "resume", otherwise my laptop would hang after resuming from hibernation.

I suspect this is because the echo "resume" > /proc/driver/nvidia/suspend and echo "hibernate" > /proc/driver/nvidia/suspend calls (via /usr/bin/nvidia-sleep.sh) occur too rapidly for the nvidia driver to handle properly.
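
Put together, the hook with the delay might look like the sketch below. The NVIDIA_SLEEP and RESUME_DELAY variables are hypothetical knobs added here only so the script can be dry-run without touching the driver; the 4-second default is simply the value that worked on my laptop, not a documented requirement:

```shell
#!/bin/sh
# Sketch of /lib/systemd/system-sleep/nvidia with a pause after resume.
# NVIDIA_SLEEP and RESUME_DELAY are illustrative overrides, not part of
# the NVIDIA packaging.
NVIDIA_SLEEP="${NVIDIA_SLEEP:-/usr/bin/nvidia-sleep.sh}"
RESUME_DELAY="${RESUME_DELAY:-4}"

nvidia_hook() {
    # $1 is "pre" or "post"; systemd-sleep exports SYSTEMD_SLEEP_ACTION.
    case "${1:-}" in
        pre)
            case "${SYSTEMD_SLEEP_ACTION:-}" in
                suspend|hibernate)
                    "$NVIDIA_SLEEP" "$SYSTEMD_SLEEP_ACTION"
                    ;;
                suspend-after-failed-hibernate)
                    "$NVIDIA_SLEEP" "suspend"
                    ;;
            esac
            ;;
        post)
            "$NVIDIA_SLEEP" "resume"
            # Let the driver settle before the next "pre" call arrives.
            sleep "$RESUME_DELAY"
            ;;
    esac
}

nvidia_hook "$@"
```

A dry run such as NVIDIA_SLEEP=echo RESUME_DELAY=0 SYSTEMD_SLEEP_ACTION=hibernate sh ./nvidia pre should just print the action that would be sent to the driver.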

To debug, I put this at the top of /lib/systemd/system-sleep/nvidia:

echo "$(date) $1 $2 $SYSTEMD_SLEEP_ACTION" >> /var/log/my-sleep-log

Before adding the sleep, I saw:

Sun Jun 12 08:58:41 PM CDT 2022  pre  suspend-then-hibernate  suspend
Sun Jun 12 08:58:53 PM CDT 2022  post  suspend-then-hibernate  suspend
Sun Jun 12 08:58:54 PM CDT 2022  pre  suspend-then-hibernate  hibernate

After the sleep I see:

Sun Jun 12 09:28:19 PM CDT 2022  pre  suspend-then-hibernate  suspend
Sun Jun 12 09:28:31 PM CDT 2022  post  suspend-then-hibernate  suspend
Sun Jun 12 09:28:36 PM CDT 2022  pre  suspend-then-hibernate  hibernate
Sun Jun 12 09:29:18 PM CDT 2022  post  suspend-then-hibernate  hibernate

So it looks like before my tweak the nvidia driver wouldn’t properly prepare for hibernation, causing the system to crash/hang immediately after the first frame of video painted to screen, showing a frozen desktop.
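
To spot this pattern without reading the log by eye, something like the following sketch can help. It assumes every line ends with the three fields written by the debug line above (phase, operation, SYSTEMD_SLEEP_ACTION, all non-empty); the function name unmatched_pre is made up:

```shell
#!/bin/sh
# Print the most recent "pre" entry that was never followed by a "post"
# entry, i.e. the transition during which the machine hung.
unmatched_pre() {
    awk '
        NF < 3 { next }                  # skip lines too short to parse
        { phase = $(NF - 2) }            # third field from the end
        phase == "pre"  { pending = $0 }
        phase == "post" { pending = "" }
        END { if (pending != "") print pending }
    ' "$1"
}

if [ -r /var/log/my-sleep-log ]; then
    unmatched_pre /var/log/my-sleep-log
fi
```

On the first log above this prints the "pre ... hibernate" line; on the second one it prints nothing.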

While it would be nice if nvidia would write proper drivers, I would be happy if they at least rolled out this change to /lib/systemd/system-sleep/nvidia so suspend-then-hibernate isn’t broken by default (and so I don’t have to keep manually altering this file after package updates). If support can forward this to the dev team, it would be much appreciated.

Thanks for this hack - I finally got suspend-then-hibernate working.

My system is:


+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.54.03              Driver Version: 535.54.03    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Quadro P600                    Off | 00000000:26:00.0  On |                  N/A |
| 44%   60C    P0              N/A /  N/A |   1615MiB /  2048MiB |     39%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      2040      G   /usr/bin/gnome-shell                        356MiB |
|    0   N/A  N/A      2747      G   gjs                                         141MiB |
|    0   N/A  N/A      2916      G   /usr/bin/Xwayland                            17MiB |
|    0   N/A  N/A      6646      G   /usr/lib/firefox/firefox                    219MiB |
|    0   N/A  N/A      7312      G   /usr/lib/thunderbird/thunderbird            234MiB |
+---------------------------------------------------------------------------------------+

Just as a side note: on my small laptop with an AMD GPU, suspend-then-hibernate worked basically out of the box.

I agree with charles15 that some NVIDIA developers should look into the issue and come up with a solution that does not require work on the user’s side.

Hi,
Any progress on this? I’m trying to make it work on Fedora Silverblue 39 with an Optimus laptop. Since I’m on an immutable file system, some of the hacks from this thread are not possible.

Has an internal bug report been created at NVIDIA for this issue? If not, it would be great if somebody from NVIDIA were made aware of this bug. Suspend and hibernation on their own work fine with the NVIDIA dGPU, but suspend-then-hibernate is broken.

When I disable Nvidia dGPU (with supergfxctl) suspend-then-hibernate works like a charm.

I did this:

  1. Created a unit in /usr/lib/systemd/system/nvidia-suspend-then-hibernate.service:
[Unit]
Description=NVIDIA system suspend-then-hibernate actions
Before=systemd-suspend-then-hibernate.service

[Service]
Type=oneshot
ExecStart=/usr/bin/logger -t suspend -s "nvidia-suspend-then-hibernate.service"
ExecStart=/usr/bin/nvidia-sleep.sh "hibernate"

[Install]
WantedBy=systemd-suspend-then-hibernate.service
  2. Edited the unit /usr/lib/systemd/system/nvidia-resume.service:
[Unit]
Description=NVIDIA system resume actions
After=systemd-suspend.service
After=systemd-hibernate.service
After=systemd-suspend-then-hibernate.service

[Service]
Type=oneshot
ExecStart=/usr/bin/logger -t suspend -s "nvidia-resume.service"
ExecStart=/usr/bin/nvidia-sleep.sh "resume"

[Install]
WantedBy=systemd-suspend.service
WantedBy=systemd-hibernate.service
WantedBy=systemd-suspend-then-hibernate.service
  3. Removed the file /usr/lib/systemd/system-sleep/nvidia

Then, as root (nvidia-resume is disabled and re-enabled so that the new WantedBy= entry takes effect):

# systemctl daemon-reload
# systemctl disable nvidia-resume.service
# systemctl enable nvidia-suspend.service
# systemctl enable nvidia-hibernate.service
# systemctl enable nvidia-suspend-then-hibernate.service
# systemctl enable nvidia-resume.service
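
Enabling a unit with a WantedBy= line creates a symlink under /etc/systemd/system/<target>.wants/, so a quick way to confirm that the enable steps took effect is to look for those links. A rough sketch; the function name wants_link_present and its optional third argument are only for illustration:

```shell
#!/bin/sh
# Succeed if unit $1 is linked into the .wants directory of unit $2.
# $3 optionally overrides the systemd configuration directory.
wants_link_present() {
    [ -L "${3:-/etc/systemd/system}/$2.wants/$1" ]
}

for unit in nvidia-suspend-then-hibernate.service nvidia-resume.service; do
    if wants_link_present "$unit" systemd-suspend-then-hibernate.service; then
        echo "$unit: hooked into suspend-then-hibernate"
    else
        echo "$unit: NOT hooked into suspend-then-hibernate"
    fi
done
```

If nvidia-resume.service shows up as not hooked, it was probably enabled before the unit file was edited and needs the disable/enable cycle.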

To test:

In /etc/systemd/sleep.conf, set HibernateDelaySec=3min.

Warning: do not set the interval to less than 1 minute.
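
Instead of editing sleep.conf itself, the same setting can go into a drop-in under /etc/systemd/sleep.conf.d/, which is easy to remove once testing is done. A minimal sketch; the function name write_hibernate_delay and the file name hibernate-delay.conf are arbitrary choices, not anything systemd requires:

```shell
#!/bin/sh
# Write a sleep.conf drop-in that sets HibernateDelaySec.
# $1: delay (e.g. 3min)   $2: optional drop-in directory
write_hibernate_delay() {
    dir="${2:-/etc/systemd/sleep.conf.d}"
    mkdir -p "$dir" &&
        printf '[Sleep]\nHibernateDelaySec=%s\n' "$1" \
            > "$dir/hibernate-delay.conf"
}

# Usage (as root):
#   write_hibernate_delay 3min
```

Deleting the drop-in file restores whatever delay the main configuration specifies.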

Command:

$ systemctl suspend-then-hibernate

It works.

Checked on Debian 12 and Gentoo, systemd version 255, nvidia-drivers version 550.67.