Blank screen after suspend on Ubuntu 17.10 with NVIDIA 390.12 driver

I have tried all the tricks found in https://devtalk.nvidia.com/default/topic/962231/linux/resume-from-suspend-freezes-system-gtx-970-arch-linux-kernel-4-4-4-7-nvidia-370-/post/4972479/#4972479 without success.

I think it is more or less a duplicate of e.g. https://devtalk.nvidia.com/default/topic/1017185/linux/problem-with-resume-from-suspend-ubuntu-16-04-gt-940mx-/2

The weird thing is that the first time I do

$ systemctl suspend

everything works as planned. The second time it gives a blank screen.

This also happens with driver version 387.34, and it first appeared when I upgraded from 17.04 to 17.10. Before the upgrade, suspend always worked perfectly.

Any ideas?

nvidia-bug-report.sh
https://www.dropbox.com/s/gc83o6bozshqa8d/nvidia-bug-report.log.gz?dl=0

This is the best and only solution so far: https://www.dropbox.com/s/u3pqtejcdggxpbr/NVIDIAtoNouveau.png?dl=0

There’s no suspend/resume cycle visible in your logs.

How do I get the cycle included? I need to reboot after “resume” so it’s kind of tricky.

dmesg starts at boot, and I noticed nothing in the other logs. Maybe there is something in kern.log, which does not seem to be included in the nvidia-bug-report. So here are:

  1. https://www.dropbox.com/s/i728c8hjmmnliq9/nvidia-bug-report.succesful.log.gz?dl=0 Report right after the first successful suspend/resume. It always works the first time!

  2. https://www.dropbox.com/s/wy0v6ebk3p62c1p/kern-succesful-after-boot.log?dl=0 The corresponding kern.log.

  3. https://www.dropbox.com/s/95mk2v5cb0l6s9u/nvidia-bug-report-after-boot.log.gz.?dl=0 Report right after the forced boot.

  4. https://www.dropbox.com/s/e4smhszpi4fpf64/kern-after-forced-boot.log?dl=0 Kern.log after forced shutdown and boot.

Install openssh, start the ssh server, and log in from another system when the machine is frozen.
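For reference, a minimal sketch of how that could look. The setup commands in the comments assume Ubuntu's `openssh-server` package and its `ssh` service; `collect_freeze_logs` is a hypothetical helper name (not part of any NVIDIA or Ubuntu tool) that just gathers the logs mentioned in this thread into one directory once you are logged in remotely.

```shell
# On the affected machine, before reproducing the freeze (assumed Ubuntu setup):
#   sudo apt install openssh-server
#   sudo systemctl start ssh
# From another machine, once the screen is blank:
#   ssh <user>@<laptop-address>

# Hypothetical helper: gather logs into a directory and print its path.
collect_freeze_logs() {
    out_dir="${1:-$HOME/freeze-logs-$(date +%Y%m%d-%H%M%S)}"
    mkdir -p "$out_dir" || return 1

    # Kernel ring buffer; this may need root on some systems, so failures are tolerated.
    dmesg > "$out_dir/dmesg.txt" 2>&1 || true

    # Journal of the current boot, only if journalctl is available.
    if command -v journalctl >/dev/null 2>&1; then
        journalctl -b --no-pager > "$out_dir/journal.txt" 2>&1 || true
    fi

    # kern.log, only if it exists and is readable.
    if [ -r /var/log/kern.log ]; then
        cp /var/log/kern.log "$out_dir/kern.log" || true
    fi

    echo "$out_dir"
}
```

Over the ssh session you could then run `collect_freeze_logs` followed by `sudo nvidia-bug-report.sh`, so both sets of logs cover the failed resume.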

Right, of course that works! Thank you!

Here they are:

https://www.dropbox.com/s/gc83o6bozshqa8d/nvidia-bug-report.log.gz?dl=0

https://www.dropbox.com/s/gv7e9asjcwb1h96/kern-unseccesfull.log?dl=0

I’ve taken a look at your logs and those are inconsistent with your description. I’m seeing one suspend/resume cycle, some time after that the xserver just closes, everything else is fine.
Journal logs are not included, so I can only speculate. It looks like the second suspend doesn’t work at all; gnome might just be crashing. The Nvidia gpu/driver is working fine. You should be able to just switch to a VT using ctrl+alt+F1.
Otherwise, can you repeat the procedure, note the exact time when you issue each suspend command, and leave a minute between them?
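One way to record those times is sketched below. `log_suspend_time` and the default file `~/suspend-times.log` are made-up names for illustration; the function only appends a timestamped marker line, and the actual `systemctl suspend` call is left to you.

```shell
# Hypothetical helper: append "<date> <time> <label>" to a log file.
log_suspend_time() {
    label="$1"
    log_file="${2:-$HOME/suspend-times.log}"
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$label" >> "$log_file"
}

# Usage (not run automatically):
#   log_suspend_time "suspend 1"; sync; systemctl suspend
#   ... resume, wait at least a minute ...
#   log_suspend_time "suspend 2"; sync; systemctl suspend
```

The `sync` before suspending is there so the marker is more likely to be on disk if the machine has to be powered off afterwards. The recorded times can then be matched against the timestamps in kern.log and the journal.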

Yes, I think that is exactly the problem. It doesn’t suspend at all. I can even “feel” or “see” it, because the computer goes into “suspend”, or rather shuts off, too fast. The sound is also different.

No, I can’t. That is why running nvidia-bug-report.sh was pretty hard in the first place. Also, I use xfce4, not gnome.

So it looks like something in your system is blocking the 2nd suspend, and I doubt it has to do with the nvidia driver/hw, as it’s not even reaching the point where it gets powered down.
Check your journals, either through ssh or by making the journal persistent and reading it after reboot using
sudo journalctl -b -1
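A rough sketch of both steps follows. It assumes journald is running with its default `Storage=auto` setting, in which case creating `/var/log/journal` is what turns on persistent storage. `filter_suspend_events` is a hypothetical helper, and its keyword list is only a guess at what is relevant here.

```shell
# Assumed way to make the journal persistent (with Storage=auto in journald.conf):
#   sudo mkdir -p /var/log/journal
#   sudo systemctl restart systemd-journald

# Hypothetical helper: keep only lines that look related to suspend/resume.
# Reads journal text on stdin and writes the matching lines to stdout.
filter_suspend_events() {
    grep -Ei 'suspend|resume|PM: |systemd-sleep|power-manager'
}

# Usage after the forced reboot (not run automatically):
#   sudo journalctl -b -1 --no-pager | filter_suspend_events
```

If the filter hides too much, drop it and read the full `journalctl -b -1` output around the times of the two suspend attempts.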

Thank you generix again!

Here are the journal logs. The .rtf one was made on a Mac while SSH'd into Xubuntu.

https://www.dropbox.com/s/y27hmypezigy2rg/journals.rtf?dl=0
https://www.dropbox.com/s/89vysdi5kjnw3un/journal-boot.txt?dl=0

What about the power-manager stuff? I don’t know what it means, but it is the only difference I can see.

Feb 02 17:46:57 tipi-laptop dbus[1557]: [system] Rejected send message, 4 matched rules; type="method_call", sender=":1.45" (uid=1000 pid=4178 comm="xfce4-power-manager --restart --sm-client-id 2e791"

I already tried disabling the power management options from the XFCE UI, but it did not make any difference. After a bit of searching, I gather that these power-management settings are set at many different levels. Disabling these services is not a very nice solution anyway. Any ideas?

I also found this: https://bugs.launchpad.net/ubuntu/+source/xubuntu-default-settings/+bug/1303736 and the reporters there all have NVIDIA modules installed. I don’t know if this is an NVIDIA or a Linux kernel bug, but they are obviously connected.

This looks like a concatenation of stupid things happening. I suspect that on the second suspend xfce is logging you out (I don’t know why), so on resume lightdm starts, but then gpumanager freaks out and does stupid things, so lightdm ends up in a start-stop loop.
Please check what happens if you use
nogpumanager
as kernel parameter.
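In case it helps, here is a sketch of making that parameter permanent. It assumes the usual Ubuntu layout, where `GRUB_CMDLINE_LINUX_DEFAULT` in `/etc/default/grub` holds the default kernel parameters. `add_kernel_param` is a hypothetical helper, and it relies on GNU `sed -i`.

```shell
# Hypothetical helper: append a parameter to GRUB_CMDLINE_LINUX_DEFAULT.
# Usage: add_kernel_param <param> [grub-file]
add_kernel_param() {
    param="$1"
    grub_file="${2:-/etc/default/grub}"

    # Do nothing if the parameter is already on the line.
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*[\" ]${param}[\" ]" "$grub_file"; then
        return 0
    fi

    # Insert " <param>" just before the closing quote of the value.
    sed -i "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 ${param}\"/" "$grub_file"
}

# Usage (as root, not run automatically):
#   add_kernel_param nogpumanager
#   update-grub
#   reboot
```

After the reboot, `cat /proc/cmdline` should list `nogpumanager`. For a one-off test you can instead press `e` on the GRUB menu entry and add the parameter to the `linux` line by hand.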

No difference. Exactly the same. First suspend usually works and the second does not.

$ cat /var/log/gpu-manager.log
log_file: /var/log/gpu-manager.log
last_boot_file: /var/lib/ubuntu-drivers-common/last_gfx_boot
new_boot_file: /var/lib/ubuntu-drivers-common/last_gfx_boot
can't access /run/u-d-c-fglrx-was-loaded file
can't access /opt/amdgpu-pro/bin/amdgpu-pro-px
Looking for fglrx modules in /lib/modules/4.13.0-32-generic/updates/dkms
Looking for nvidia modules in /lib/modules/4.13.0-32-generic/updates/dkms
Found nvidia module: nvidia_390.ko
Looking for amdgpu modules in /lib/modules/4.13.0-32-generic/updates/dkms
Is nvidia loaded? yes
Was nvidia unloaded? no
Is nvidia blacklisted? no
Is fglrx loaded? no
Was fglrx unloaded? no
Is fglrx blacklisted? no
Is intel loaded? yes
Is radeon loaded? no
Is radeon blacklisted? no
Is amdgpu loaded? no
Is amdgpu blacklisted? no
Is amdgpu versioned? no
Is amdgpu pro stack? no
Is nouveau loaded? no
Is nouveau blacklisted? yes
Is fglrx kernel module available? no
Is nvidia kernel module available? yes
Is amdgpu kernel module available? no
Vendor/Device Id: 8086:191b
BusID "PCI:0@0:2:0"
Is boot vga? yes
Vendor/Device Id: 10de:13d9
BusID "PCI:1@0:0:0"
Is boot vga? no
Skipping "/dev/dri/card1", driven by "i915"
Skipping "/dev/dri/card0", driven by "nvidia-drm"
Skipping "/dev/dri/card1", driven by "i915"
Skipping "/dev/dri/card0", driven by "nvidia-drm"
Skipping "/dev/dri/card1", driven by "i915"
Skipping "/dev/dri/card0", driven by "nvidia-drm"
Found "/dev/dri/card1", driven by "i915"
output 0:
	card1-eDP-1
Number of connected outputs for /dev/dri/card1: 1
Does it require offloading? yes
last cards number = 2
Has amd? no
Has intel? yes
Has nvidia? yes
How many cards? 2
Has the system changed? No
main_arch_path x86_64-linux-gnu, other_arch_path i386-linux-gnu
Current alternative: /usr/lib/nvidia-390/ld.so.conf
Current core alternative: (null)
Current egl alternative: /usr/lib/nvidia-390/ld.so.conf
Is nvidia enabled? yes
Is nvidia egl enabled? yes
Is fglrx enabled? no
Is mesa enabled? no
Is mesa egl enabled? no
Is pxpress enabled? no
Is prime enabled? no
Is prime egl enabled? no
Is nvidia available? yes
Is nvidia egl available? no
Is fglrx available? no
Is fglrx-core available? no
Is mesa available? yes
Is mesa egl available? yes
Is pxpress available? no
Is prime available? yes
Is prime egl available? no
Intel IGP detected
Intel hybrid system
can't access /usr/share/gpu-manager.d/force-dgpu-on file
force-dgpu-on hook off
Nvidia driver version 390.12 detected
/sys/class/dmi/id/product_version="Not Applicable"
/sys/class/dmi/id/product_name="P65_P67RGRERA"
1st try: bbswitch without quirks
Loading bbswitch with "load_state=-1 unload_state=1" parameters
can't access /usr/share/gpu-manager.d/hybrid-performance
intel_matches: 1, nvidia_matches: 1, intel_set: 1, nvidia_set: 1 x_options_matches: 4, accel_method_matches: 1
No need to modify xorg.conf. Path: /etc/X11/xorg.conf
No need to change the current bbswitch status

What about these in the log?

Intel hybrid system
can't access /usr/share/gpu-manager.d/force-dgpu-on file

and

can't access /usr/share/gpu-manager.d/hybrid-performance

I have no idea what these files do but I don’t have them there.