Suspend swap group failed / Resume swap group failed. Nvidia 390.25

Hi,

I just installed NVIDIA 390.25 here, on kernel 4.15. As usual, suspend and hibernate do not work because the system does not resume properly: only the terminal windows retain their state. All GUI applications disappear, but the workspaces they were running on before suspend or hibernate are not released. If I enter one of those workspaces, it shows the terminal window from the most recent workspace I was in. For example, I had Chromium running on workspace 3 and then hibernated the system. After resuming, when I entered workspace 3, it showed the terminal window from workspace 1 (the workspace I was previously in) instead of Chromium.

Window manager is i3wm.
Distro is Arch Linux.

Xorg logs are filled with multiple instances of the following error after resuming the system:

[  3655.967] (WW) NVIDIA(0): Suspend swap group failed.
[  3655.967] (WW) NVIDIA(0): Resume swap group failed.
[  3655.967] (WW) NVIDIA(0): Suspend swap group failed.
[  3655.968] (WW) NVIDIA(0): Resume swap group failed.
[  3655.968] (WW) NVIDIA(0): Suspend swap group failed.
[  3655.968] (WW) NVIDIA(0): Resume swap group failed.
[  3655.968] (WW) NVIDIA(0): Suspend swap group failed.
[  3655.968] (WW) NVIDIA(0): Resume swap group failed.
[  3655.969] (WW) NVIDIA(0): Suspend swap group failed.
[  3655.969] (WW) NVIDIA(0): Resume swap group failed.
[  3655.969] (WW) NVIDIA(0): Suspend swap group failed.
[  3655.970] (WW) NVIDIA(0): Resume swap group failed.
[  3655.970] (WW) NVIDIA(0): Suspend swap group failed.
[  3655.970] (WW) NVIDIA(0): Resume swap group failed.
[  3655.970] (WW) NVIDIA(0): Suspend swap group failed.
[  3655.971] (WW) NVIDIA(0): Resume swap group failed.
[  3655.971] (WW) NVIDIA(0): Suspend swap group failed.
[  3655.971] (WW) NVIDIA(0): Resume swap group failed.
[  3655.971] (WW) NVIDIA(0): Suspend swap group failed.
[  3655.971] (WW) NVIDIA(0): Resume swap group failed.
[  3655.971] (WW) NVIDIA(0): Suspend swap group failed.
[  3655.972] (WW) NVIDIA(0): Resume swap group failed.
[  3658.867] (WW) NVIDIA(0): Suspend swap group failed.
[  3658.867] (WW) NVIDIA(0): Resume swap group failed.
[  3658.867] (WW) NVIDIA(0): Suspend swap group failed.
[  3658.867] (WW) NVIDIA(0): Resume swap group failed.
[  3658.867] (WW) NVIDIA(0): Suspend swap group failed.
[  3658.868] (WW) NVIDIA(0): Resume swap group failed.
[  3658.868] (WW) NVIDIA(0): Suspend swap group failed.
[  3658.872] (WW) NVIDIA(0): Resume swap group failed.
[  3658.873] (WW) NVIDIA(0): Suspend swap group failed.
[  3658.889] (WW) NVIDIA(0): Resume swap group failed.
[  3658.889] (WW) NVIDIA(0): Suspend swap group failed.
[  3658.890] (WW) NVIDIA(0): Resume swap group failed.
[  3658.920] (WW) NVIDIA(0): Suspend swap group failed.
[  3658.920] (WW) NVIDIA(0): Resume swap group failed.
[  3658.920] (WW) NVIDIA(0): Suspend swap group failed.
[  3658.920] (WW) NVIDIA(0): Resume swap group failed.
[  3658.920] (WW) NVIDIA(0): Suspend swap group failed.
[  3658.920] (WW) NVIDIA(0): Resume swap group failed.
[  3658.920] (WW) NVIDIA(0): Suspend swap group failed.
[  3658.920] (WW) NVIDIA(0): Resume swap group failed.
[  3658.953] (**) Option "fd" "21"
[  3658.953] (**) Option "fd" "21"
[  3658.953] (II) event4  - (II) Chicony HP Business Slim Keyboard: (II) device removed
[  3658.953] (**) Option "fd" "60"
[  3658.953] (II) event3  - (II) Chicony HP Business Slim Keyboard: (II) device removed
[  3658.953] (**) Option "fd" "20"
[  3658.953] (II) event2  - (II) Power Button: (II) device removed
[  3658.953] (**) Option "fd" "37"
[  3658.953] (II) event1  - (II) Power Button: (II) device removed
[  3658.953] (**) Option "fd" "38"
[  3658.953] (II) event6  - (II) HP WMI hotkeys: (II) device removed
[  3658.953] (**) Option "fd" "39"
[  3658.953] (II) event5  - (II) PixArt HP USB Optical Mouse: (II) device removed
[  3658.954] (**) Option "fd" "40"
[  3658.954] (II) event0  - (II) Sleep Button: (II) device removed
[  3658.954] (II) UnloadModule: "libinput"
[  3658.954] (II) systemd-logind: releasing fd for 13:64
[  3658.954] (EE) systemd-logind: failed to release device: Connection was disconnected before a reply was received
[  3658.970] (II) UnloadModule: "libinput"
[  3658.970] (II) systemd-logind: releasing fd for 13:69
[  3658.970] (EE) systemd-logind: failed to release device: Connection is closed
[  3658.999] (II) UnloadModule: "libinput"
[  3658.999] (II) systemd-logind: releasing fd for 13:70
[  3658.999] (EE) systemd-logind: failed to release device: Connection is closed
[  3659.026] (II) UnloadModule: "libinput"
[  3659.026] (II) systemd-logind: releasing fd for 13:65
[  3659.026] (EE) systemd-logind: failed to release device: Connection is closed
[  3659.038] (II) UnloadModule: "libinput"
[  3659.038] (II) systemd-logind: releasing fd for 13:66
[  3659.038] (EE) systemd-logind: failed to release device: Connection is closed
[  3659.050] (II) UnloadModule: "libinput"
[  3659.050] (II) systemd-logind: releasing fd for 13:67
[  3659.050] (EE) systemd-logind: failed to release device: Connection is closed
[  3659.065] (II) UnloadModule: "libinput"
[  3659.065] (II) systemd-logind: not releasing fd for 13:68, still in use
[  3659.065] (II) UnloadModule: "libinput"
[  3659.065] (II) systemd-logind: releasing fd for 13:68
[  3659.065] (EE) systemd-logind: failed to release device: Connection is closed
[  3659.336] (WW) NVIDIA(0): Free swap group failed.
[  3659.539] (II) NVIDIA(GPU-0): Deleting GPU-0
[  3659.540] (WW) xf86CloseConsole: KDSETMODE failed: Input/output error
[  3659.540] (WW) xf86CloseConsole: VT_GETMODE failed: Input/output error
[  3659.540] (WW) xf86CloseConsole: VT_ACTIVATE failed: Input/output error
[  3659.540] (EE) systemd-logind: ReleaseControl failed: Connection is closed
[  3659.540] (II) Server terminated successfully (0). Closing log file.
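
For anyone who wants to gauge how much of their own log these warnings take up, here is a minimal Python sketch that tallies repeated warning and error lines in Xorg log text. The names `tally_messages`, `LOG_LINE` and `sample` are made up for illustration, and the sample is just a few lines copied from the log above. It assumes the usual `[ timestamp] (LEVEL) message` line layout and is not an official tool.

```python
import re
from collections import Counter

# Assumed layout of an Xorg log line: "[ timestamp] (LEVEL) message".
# search() is used instead of match() so that any leading text
# (for example editor line numbers) is ignored.
LOG_LINE = re.compile(r"\[\s*(\d+\.\d+)\]\s+\((\S\S)\)\s+(.*)")


def tally_messages(log_text, levels=("WW", "EE")):
    """Count how often each (level, message) pair occurs in the log text."""
    counts = Counter()
    for line in log_text.splitlines():
        m = LOG_LINE.search(line)
        if m and m.group(2) in levels:
            counts[(m.group(2), m.group(3).strip())] += 1
    return counts


# A few lines copied from the log above, used as sample input.
sample = """\
[  3655.967] (WW) NVIDIA(0): Suspend swap group failed.
[  3655.967] (WW) NVIDIA(0): Resume swap group failed.
[  3655.967] (WW) NVIDIA(0): Suspend swap group failed.
[  3658.954] (II) UnloadModule: "libinput"
[  3658.970] (EE) systemd-logind: failed to release device: Connection is closed
"""

# Print the most frequent warnings and errors first.
for (level, msg), n in tally_messages(sample).most_common():
    print(f"{n:4d}  ({level}) {msg}")
```

To run it against a real log, read the file contents (on many systems `/var/log/Xorg.0.log`, though the location varies) and pass them to `tally_messages` in place of `sample`.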

nvidia bug report:

nvidia-bug-report.log.gz (75.4 KB)

I am also seeing the same issue, with the server terminating on me after “Free swap group failed.”

I also see this warning after simply switching tty:

Is there any update / information on this issue?

I suspect ‘swap group’ refers to Quadro Sync, which doesn’t apply to anyone seeing the message. 390.25 seems to be broken in terms of vsync and other issues, and completely unfit for kernel 4.15.

@generix, I found https://bugs.launchpad.net/ubuntu/+bug/1747588 via a Google search, just sharing…

I get the same result and warnings with kernel 4.14 as well.

Any idea whether this suspend / hibernation issue will be fixed in one of the next updates? It would be really helpful, since I prefer to use the GPU for everything as it offers much better image quality than Intel HD Graphics.

Thanks, I appreciate your help on this.

It is sad, but devs don’t seem to reply to suspend/resume issues anymore.
I can’t say why.

You’re right that “swap group” here refers to the Quadro Sync feature. Quadro Sync is going through an overhaul and there’s an accounting bug in 390.25 that results in those messages. They’re completely harmless, other than filling the log. They should be fixed in the next release.

Sorry for the inconvenience!

Thanks for responding @aplattner, glad to hear it will get fixed in the next release. On a side note, is this bug also responsible for Chromium crashing each time the system is woken up from suspend / hibernate?

Thanks again!

No, like I said, the messages should be completely harmless.

Is there a thread with a backtrace for the chromium crash?

Well, there is this thread about various performance issues: https://devtalk.nvidia.com/default/topic/1029484/linux/-various-all-distros-numerous-performance-amp-rendering-issues-on-390-25/1. However, on my card (Quadro K420), Chromium works fine for the most part. I couldn’t find a thread specifically about Chromium crashing on resume from suspend.

Attaching logs after waking system from suspend.
Thanks for your help
nvidia-bug-report.log.gz (77.4 KB)

At least we managed to confirm that systemd-journald log rotation works as expected ;)

Hello,

I believe I’m experiencing the same bug as you, @kgandhiok.

My login session is frequently lost when the computer resumes from suspension
or hibernation. The computer takes me to the Gnome login screen.

With driver 384.111, about one in two resumptions fail this way.
With driver 390.25, it happens less frequently.

Debian 9.3

kernel 4.15.0-1-amd64, and kernel 4.14.0-0.bpo.3-amd64

GeForce 1060

I found only two suspicious things in nvidia-bug-report.log.gz, which was
freshly generated after the bug:

[32940.892161] [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
[32940.892198] cache: parent cpu15 should not be sleeping
[32940.892287] microcode: CPU15: patch_level=0x08001129
[32940.892650] CPU15 is up
[32940.894082] ACPI: Waking up from system sleep state S3

[32941.030214] nvidia 0000:23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000e address=0x0000000000000000 flags=0x0000]
https://pastebin.com/hQhyp8kC

and

[ 32952.583] (II) NVIDIA: Using 24576.00 MB of virtual memory for indirect memory
[ 32952.583] (II) NVIDIA: access.
[ 32952.586] (II) NVIDIA(0): ACPI: failed to connect to the ACPI event daemon; the daemon
[ 32952.586] (II) NVIDIA(0): may not be running or the "AcpidSocketPath" X
[ 32952.586] (II) NVIDIA(0): configuration option may not be set correctly. When the
[ 32952.586] (II) NVIDIA(0): ACPI event daemon is available, the NVIDIA X driver will
[ 32952.586] (II) NVIDIA(0): try to use it to receive ACPI event notifications. For
[ 32952.586] (II) NVIDIA(0): details, please see the "ConnectToAcpid" and
[ 32952.586] (II) NVIDIA(0): "AcpidSocketPath" X configuration options in Appendix B: X
[ 32952.586] (II) NVIDIA(0): Config Options in the README.
[ 32952.599] (II) NVIDIA(0): Setting mode "DFP-0:nvidia-auto-select,DFP-1:nvidia-auto-select"
[ 32952.702] (==) NVIDIA(0): Disabling shared memory pixmaps
[ 32952.702] (==) NVIDIA(0): Backing store enabled
[ 32952.702] (==) NVIDIA(0): Silken mouse enabled
[ 32952.703] (==) NVIDIA(0): DPMS enabled

I originally reported this bug on an Ubuntu thread that looked to be the same
issue (https://bugs.launchpad.net/gnome-shell/+bug/1721428/comments/74)
and also (mistakenly) on the mailing list for the Debian nvidia-driver packaging team (https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=891960).

I have attached an “nvidia-bug-report” from a different, more recent instance of the bug.

Thank you
nvidia-bug-report.log (535 KB)

Will that fix be part of the 390.xx branch, or only 395.xx (or whatever the next main driver version is)?

I can confirm the bug here. Just now I’ve done the following in a root shell to protect my SSD:

rm Xorg.0.log ; ln -s /dev/null Xorg.0.log

Upgraded to 390.42.
Glad to report the system now resumes from suspend and hibernate without any warnings. The “suspend swap group failed / resume swap group failed” warnings no longer appear in the Xorg logs.
Thanks!

I think this is still happening for me with 390.42. But happy to see this conversation!

I am also encountering this with 390.30 on Ubuntu 16.04, using the official NVIDIA repos.

/usr/lib/gdm3/gdm-x-session[25029]: (WW) NVIDIA(0): Suspend swap group failed.
/usr/lib/gdm3/gdm-x-session[25029]: (WW) NVIDIA(0): Resume swap group failed.

I often get hangs when switching between TTYs, which is why I switched to GDM instead of 16.04’s default LightDM (which, along with the rest of the Unity DE, has now been ditched in 18.04, hurrah!). I still get hangs, though, and on inspection this message occurs a lot in the logs. :/