Driver does not wake GPU properly after suspend (Ubuntu 18.10 with branch 390, 410 and 415)

Issue on Ubuntu bugtracker:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1803179

Some added info. This bug is present in:
driver : nvidia-driver-415 - third-party free recommended
driver : nvidia-driver-410 - third-party free
driver : nvidia-driver-390 - distro non-free

GPU info:
== /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0 ==
modalias : pci:v000010DEd00001C8Csv00001028sd0000087Cbc03sc02i00
vendor : NVIDIA Corporation
model : GP107M [GeForce GTX 1050 Ti Mobile]

When in the faulty state, nvidia-smi reports 100% GPU utilization.
Switching GPUs with prime-select intel and killing X brings the display environment back on the onboard GPU.

If needed, I’ll run nvidia-bug-report.sh.

Please try with kernel parameters
acpi_osi=! acpi_osi="Windows 2009"

The newest Ubuntu 18.10 kernel makes video fail in many ways for me. Before chasing video problems, try the older kernel.

There’s some kind of really important change in the kernels that come after this one:

$ uname -a
Linux delllap-16 4.18.0-10-generic #11-Ubuntu SMP Thu Oct 11 15:13:55 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Upgrades after that one give me 1) a black screen where the display manager fails to start and 2) a screen that fails to light up after suspend. I gave up trying to fix it as soon as I realized the kernel update caused the problems.

I’m not running any off-the-path nvidia drivers. Just xserver-xorg-video-nvidia-390 from Ubuntu 18.10 repositories.
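
To tell whether a machine is already past the kernel named above, the version strings can be compared directly. This is a minimal sketch, assuming GNU `sort -V` is available (it ships with coreutils on Ubuntu). The helper name `kernel_is_newer` is made up for illustration, and the baseline 4.18.0-10-generic is taken from the `uname -a` output above.

```sh
#!/bin/sh
# kernel_is_newer CURRENT BASELINE
# Succeeds (exit status 0) when CURRENT sorts strictly after BASELINE
# in version order. Hypothetical helper; relies on GNU sort -V.
kernel_is_newer() {
    current=$1
    baseline=$2
    # Identical versions are not "newer".
    [ "$current" != "$baseline" ] || return 1
    newest=$(printf '%s\n%s\n' "$current" "$baseline" | sort -V | tail -n 1)
    [ "$newest" = "$current" ]
}

# Compare the running kernel against the last kernel reported as working.
if kernel_is_newer "$(uname -r)" "4.18.0-10-generic"; then
    echo "running kernel is newer than 4.18.0-10-generic"
else
    echo "running kernel is 4.18.0-10-generic or older"
fi
```

If the running kernel is newer, booting the older entry from the GRUB "Advanced options" menu is a low-risk way to test whether the kernel update is the trigger.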

With 410 drivers and kernel 4.18.0-11-generic this issue is better solved with kernel parameters:
acpi_rev_override=1 acpi_osi=Linux nouveau.modeset=0 pcie_aspm=force drm.vblankoffdelay=1 scsi_mod.use_blk_mq=1 nouveau.runpm=0 mem_sleep_default=deep
instead of:
acpi_osi=! acpi_osi="Windows 2009"

This is because the latter also negatively affects the touchpad and other devices.

So: Is this a kernel issue or a nvidia driver issue?

It’s a PCI kernel bug:
https://bugzilla.kernel.org/show_bug.cgi?id=156341

I just tested this grub config file on Ubuntu with kernel 4.18.0-11-generic, and the black screen on resume from suspend is solved, AS LONG AS the display manager is lightdm. With gdm3 I still get a black screen on boot and on resume.

GRUB_DEFAULT=0
GRUB_TIMEOUT_STYLE=menu

GRUB_TIMEOUT=0

GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="nosplash"
GRUB_CMDLINE_LINUX="nouveau.blacklist=1 acpi_rev_override=1 acpi_osi=Linux nouveau.modeset=0 pcie_aspm=force drm.vblankoffdelay=1 scsi_mod.use_blk_mq=1 nouveau.runpm=0 mem_sleep_default=deep"

With this (after update-grub) and prime-select nvidia, the suspend problem is solved. I’ve tested several times today.
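
After update-grub and a reboot, it can help to confirm that the parameters actually reached the kernel and to see which display manager is the default, since the result above depends on lightdm. The following is a rough sketch, not a definitive check: `missing_params` is a hypothetical helper, the parameter list is copied from the GRUB_CMDLINE_LINUX line above, and /etc/X11/default-display-manager is assumed to be present as it normally is on Debian and Ubuntu.

```sh
#!/bin/sh
# missing_params CMDLINE PARAM...
# Prints each PARAM that does not appear as a whole word in CMDLINE.
# Hypothetical helper for illustration.
missing_params() {
    cmdline=$1
    shift
    for param in "$@"; do
        case " $cmdline " in
            *" $param "*) ;;                # present, nothing to report
            *) printf '%s\n' "$param" ;;    # absent from the command line
        esac
    done
}

# Parameters from GRUB_CMDLINE_LINUX in the config above.
wanted="nouveau.blacklist=1 acpi_rev_override=1 acpi_osi=Linux nouveau.modeset=0 pcie_aspm=force drm.vblankoffdelay=1 scsi_mod.use_blk_mq=1 nouveau.runpm=0 mem_sleep_default=deep"

if [ -r /proc/cmdline ]; then
    # $wanted is unquoted on purpose so it splits into one word per parameter.
    missing_params "$(cat /proc/cmdline)" $wanted
fi

# On Debian and Ubuntu this file names the default display manager,
# for example /usr/sbin/gdm3 or /usr/sbin/lightdm.
if [ -r /etc/X11/default-display-manager ]; then
    basename "$(cat /etc/X11/default-display-manager)"
fi
```

No parameter names in the output means every listed parameter is on the running kernel's command line.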

I am facing a similar issue. I tried @pauljohn's workaround, but the problem persists.

Grepping my syslog for the relevant details gives the following errors. They were the same before I updated my grub file according to @pauljohn’s suggestions; however, I am still using gdm3 as display manager.

Oct  6 15:44:50 nomitri-dl-laptop kernel: [ 3648.254341] nvidia-modeset: WARNING: GPU:0: Lost display notification (0:0x00000000); continuing.
Oct  6 15:45:07 nomitri-dl-laptop kernel: [ 3665.047242] nvidia-modeset: ERROR: GPU:0: Idling display engine timed out: 0x0000c57d:0:0
Oct  6 15:45:09 nomitri-dl-laptop kernel: [ 3667.047261] nvidia-modeset: ERROR: GPU:0: Idling display engine timed out: 0x0000c57e:1:0
Oct  6 15:45:11 nomitri-dl-laptop kernel: [ 3669.047546] nvidia-modeset: ERROR: GPU:0: Idling display engine timed out: 0x0000c57d:0:0
Oct  6 15:45:13 nomitri-dl-laptop kernel: [ 3671.047563] nvidia-modeset: ERROR: GPU:0: Idling display engine timed out: 0x0000c57e:1:0
Oct  6 15:45:15 nomitri-dl-laptop kernel: [ 3673.047797] nvidia-modeset: ERROR: GPU:0: Idling display engine timed out: 0x0000c57d:0:0
Oct  6 15:45:17 nomitri-dl-laptop kernel: [ 3675.047810] nvidia-modeset: ERROR: GPU:0: Idling display engine timed out: 0x0000c57e:1:0
Oct  6 15:45:30 nomitri-dl-laptop kernel: [ 3688.509371] nvidia-modeset: WARNING: GPU:0: Lost display notification (0:0x00000000); continuing.
Oct  6 15:45:33 nomitri-dl-laptop kernel: [ 3691.512220] nvidia-modeset: WARNING: GPU:0: Lost display notification (0:0x00000000); continuing.
Oct  6 15:48:30 nomitri-dl-laptop kernel: [ 3867.774572] INFO: task nvidia-modeset/:697 blocked for more than 120 seconds.
Oct  6 15:48:30 nomitri-dl-laptop kernel: [ 3867.774575] nvidia-modeset/ D    0   697      2 0x80000000
Oct  6 15:48:30 nomitri-dl-laptop kernel: [ 3867.774608]  nvkms_kthread_q_callback+0x65/0xe0 [nvidia_modeset]
Oct  6 15:48:30 nomitri-dl-laptop kernel: [ 3867.774612]  _main_loop+0x76/0x140 [nvidia_modeset]
Oct  6 15:48:30 nomitri-dl-laptop kernel: [ 3867.774617]  ? _raw_q_schedule+0x80/0x80 [nvidia_modeset]
Oct  6 15:50:30 nomitri-dl-laptop kernel: [ 3988.606189] INFO: task nvidia-modeset/:697 blocked for more than 120 seconds.
Oct  6 15:50:30 nomitri-dl-laptop kernel: [ 3988.606192] nvidia-modeset/ D    0   697      2 0x80000000
Oct  6 15:50:30 nomitri-dl-laptop kernel: [ 3988.606213]  nvkms_kthread_q_callback+0x65/0xe0 [nvidia_modeset]
Oct  6 15:50:30 nomitri-dl-laptop kernel: [ 3988.606217]  _main_loop+0x76/0x140 [nvidia_modeset]
Oct  6 15:50:30 nomitri-dl-laptop kernel: [ 3988.606224]  ? _raw_q_schedule+0x80/0x80 [nvidia_modeset]
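
An excerpt like the one above can be reduced to a few counts, which makes it easier to compare a resume before and after changing a setting. This is only a sketch: `summarize_modeset_log` is a hypothetical helper, and the patterns are taken from the message texts in this log, so other driver versions may word them differently.

```sh
#!/bin/sh
# summarize_modeset_log
# Reads kernel log lines on stdin and prints how often each of the three
# message types seen in this thread occurs. Hypothetical helper.
summarize_modeset_log() {
    awk '
        /nvidia-modeset: ERROR: .*Idling display engine timed out/ { timeouts++ }
        /nvidia-modeset: WARNING: .*Lost display notification/     { lost++ }
        /task nvidia-modeset.* blocked for more than [0-9]+ seconds/ { hung++ }
        END {
            printf "idle_timeouts=%d lost_display=%d hung_tasks=%d\n", timeouts, lost, hung
        }
    '
}

# Example usage (/var/log/syslog is the Ubuntu default location):
#   summarize_modeset_log < /var/log/syslog
#   journalctl -k -b -1 | summarize_modeset_log    # kernel log of the previous boot
```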

Any idea on how to fix this would be highly appreciated. This bug is getting really annoying.

I am running Ubuntu 18.04 on a Lenovo Legion Y740 with an NVIDIA RTX 2070 and CUDA 10.0 installed.

$ uname -a
Linux nomitri-dl-laptop 5.0.0-31-generic #33~18.04.1-Ubuntu SMP Tue Oct 1 10:20:39 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

$ nvidia-smi
Sun Oct 6 16:18:16 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.26 Driver Version: 430.26 CUDA Version: 10.2 |

Please check whether enabling CSM in the BIOS works around it.

@generix thanks for the quick response.
How would that help?
I will try it, but I am always eager to learn the reasoning behind things. Could you please also explain a bit more about the nature of the error message?
That way, I might be able to help myself in the future.

IDK, it’s just an observation that on some systems running into

ERROR: GPU:0: Idling display engine timed out

enabling (not using) the CSM worked around it. Maybe due to additional VGA BIOS initialization.
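
Before changing firmware settings, it can be useful to know how the system is currently booted. The sketch below relies on the fact that Linux exposes /sys/firmware/efi only when booted through UEFI; `boot_mode` is a hypothetical helper. Note that it reports the current boot path only and cannot tell whether the CSM is enabled in the firmware.

```sh
#!/bin/sh
# boot_mode [EFI_DIR]
# Prints "uefi" if the EFI sysfs directory exists, otherwise "legacy".
# EFI_DIR defaults to /sys/firmware/efi; the argument exists only so the
# helper can be tried against another path. Hypothetical helper.
boot_mode() {
    efi_dir=${1:-/sys/firmware/efi}
    if [ -d "$efi_dir" ]; then
        echo "uefi"
    else
        # Booted via BIOS/CSM, or sysfs does not expose EFI here.
        echo "legacy"
    fi
}

boot_mode
```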

I see. The problem is that on my Lenovo Legion, switching to legacy is not so trivial:

https://forums.lenovo.com/t5/ThinkPad-T400-T500-and-newer-T/Unable-to-select-UEFI-Legacy-Boot-in-BIOS-ThinkPad-T480/m-p/43612T88

As the necessary step

In BIOS go to Config > Storage > Controller Mode > change to AHCI mode from RST mode

Now you can able to change secure boot setting and able to enable Legacy Mode.

would clear all my data, so this is not a solution I can try out on a hunch.

Do you have any other ideas on what I could try, or on what the error and the warning that precedes it (nvidia-modeset: WARNING: GPU:0: Lost display notification (0:0x00000000); continuing) might hint at as the underlying cause of this behavior?

I am facing the same issue as @fiedler.max. I am using the same device and config, and every time I resume Ubuntu 18.04 I get the same “Idling display engine timed out” error. The only way to recover is to reboot the device.
nvidia-bug-report.log.gz (1.96 MB)

Please check for a BIOS update.

I did try to update the BIOS; it tells me I am on the latest version available for my laptop. Is there anything else I can try?

Thank you pauljohn, you saved my time.