[Bug Report] Black X11 screen and partial lockup after upgrading to 515.76 with dual RTX 3060

Hi folks,

After upgrading to 515.76 on my system (AMD CPU, Asus motherboard, 2x RTX 3060; see nvidia-bug-report.log.gz for the detailed configuration) I get a blank screen when I run startx. I can log in remotely and I can generate an nvidia-bug-report (although it takes a long time to finish), but reboot hangs with the last message “kernel: nvidia-modeset: ERROR: GPU:0: Idling display engine timed out: 0x0000c67d:0:0:1119”, so I suspect a problem at the kernel level.

Things I tried:

  • Downgrading to 515.65.01 DOES solve the problem.
  • Disabling the AMD pstate driver does NOT solve the problem.
  • Disabling iommu/PCI denylisting for a normal 2xGPU configuration does NOT solve the problem.
  • Downgrading to Linux LTS 5.15.70 does NOT solve the problem.

Let me know if you need more information,

Thanks!
nvidia-bug-report.log.gz (930.8 KB)
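
For anyone who wants to confirm which driver build is installed before comparing notes, here is a minimal sketch. It assumes `modinfo` (from kmod) is available and that the kernel module is named `nvidia`; the helper names are made up for illustration and are not part of any NVIDIA tool.

```shell
#!/bin/sh
# Version reported as broken in this thread (515.65.01 is reported as working).
BAD_VERSION="515.76"

# Print the version of the installed nvidia kernel module, or nothing if
# the module (or modinfo itself) cannot be found.
nvidia_module_version() {
    modinfo -F version nvidia 2>/dev/null || true
}

# Hypothetical helper: succeeds if the given version is the one reported as broken.
is_reported_bad() {
    [ "$1" = "$BAD_VERSION" ]
}

ver="$(nvidia_module_version)"
if [ -z "$ver" ]; then
    echo "nvidia module not found"
elif is_reported_bad "$ver"; then
    echo "driver $ver: matches the version reported as broken in this thread"
else
    echo "driver $ver: not the version reported as broken"
fi
```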

Same problem with nvidia: 515.76 kernel: 5.19.10 gpu: RTX3090
Downgrading to 515.65 works.

There’s something wrong with 515.76

Same problem here with 515.76 and just one RTX3060.

Likewise; I recently upgraded to an RTX 3060 Ti, only to be met with a persistent black screen when starting Xorg or the display manager after updating. The only fix was rolling back to 515.65.01. This is on a system using early KMS, with the right modules loaded into the initramfs, using DKMS, and with pretty much all the usual precautions taken when upgrading NVIDIA drivers. None of the other common troubleshooting steps listed on the Arch Wiki, for example, help in this case.

Arch Linux, Kernel 5.19.11, Plasma.

I’m noticing that the only people reporting no immediate issue with 515.76 seem to be Pascal users (or people who aren’t disclosing their cards), and every mention of the problem is from an Ampere user. The only exception seems to be one individual who is using the open source module on a 2070, but all of this might be anecdotal at this stage.

Same issue here. Hardware is a 12900K with a 3090. Driver 515.76 caused a black screen after GRUB. Downgrading to 515.65 was the only solution. I also had to downgrade linux, lib32, and headers.

Spec            Value
Kernel          5.19.11
Driver          515.76
PCIe GPU        RTX 3060 12GB (MSI)
Integrated GPU  AMD Cezanne
Motherboard     B550-A MC-7C56
CPU             AMD Ryzen 5 5600G
Distribution    Gentoo

The driver prints out:

kernel: nvidia: loading out-of-tree module taints kernel.
kernel: nvidia: module license 'NVIDIA' taints kernel.
kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 242
kernel: nvidia 0000:10:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
kernel: nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  515.76  Mon Sep 12 19:11:54 UTC 2022
kernel: nvidia_uvm: module uses symbols nvUvmInterfaceDisableAccessCntr from proprietary module nvidia, inheriting taint.
kernel: nvidia-uvm: Loaded the UVM driver, major device number 240.
kernel: nvidia-modeset: WARNING: GPU:0: Lost display notification (0:0x00000000); continuing.
kernel: nvidia-modeset: WARNING: GPU:0: Unable to read EDID for display device HSP HSG1074 (HDMI-0)
kernel: nvidia-modeset: ERROR: GPU:0: Idling display engine timed out: 0x0000c67d:0:0:1119
kernel: nvidia-modeset: ERROR: GPU:0: Idling display engine timed out: 0x0000c67e:0:0:1128
kernel: nvidia-modeset: ERROR: GPU:0: Idling display engine timed out: 0x0000c67e:1:0:1128

nvidia-bug-report.log.gz (73.9 KB)

This occurs with or without the integrated GPU enabled in firmware.
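
If you want to check whether your machine logs the same errors, a small filter like the one below may help. This is only a sketch: it assumes the messages carry the same `nvidia-modeset: ERROR` prefix as in the log above, and `modeset_errors` is a made-up helper name. Reading the previous boot with `journalctl -b -1` also assumes the journal is persistent.

```shell
#!/bin/sh
# Read kernel log lines on stdin and print only the nvidia-modeset ERROR lines.
# "|| true" keeps the exit status zero when nothing matches.
modeset_errors() {
    grep -F 'nvidia-modeset: ERROR' || true
}

# Typical use on a systemd machine (kernel messages from the previous boot):
#   journalctl -k -b -1 | modeset_errors
# or, for the current boot without systemd:
#   dmesg | modeset_errors

# Self-contained demo using two lines quoted from the log above.
printf '%s\n' \
  'kernel: nvidia-modeset: WARNING: GPU:0: Lost display notification (0:0x00000000); continuing.' \
  'kernel: nvidia-modeset: ERROR: GPU:0: Idling display engine timed out: 0x0000c67d:0:0:1119' \
  | modeset_errors
```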

OK NVIDIA, can we get some acknowledgment of this issue? We are seeing this on Arch Linux, Gentoo, EndeavourOS, and in the Reddit forums.

Just updated today and BAM. I can’t believe the devs can’t even test their code properly before putting it in prod.

How can I downgrade the drivers? Will I first need to downgrade my kernel? I tried pacman -U, but it fails because nvidia depends on nvidia-utils, which in turn depends on nvidia. So stupid lol

If you are running Arch, I would boot off a live USB, chroot into the OS, and downgrade from there. The way I did it was to chroot into the OS, set my repositories to a certain date (September 21, 2022 in my case), save, and then run pacman -Syyuu. This was on a fresh install, so I had no files in the pacman cache. If you have files there you can use those, but please follow the Arch wiki:
https://wiki.archlinux.org/title/Downgrading_packages
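
As a rough sketch of the "set repositories to a certain date" step: the wiki page above describes pointing the mirrorlist at an Arch Linux Archive snapshot. The helper name below is made up, and the date is just the one mentioned in the post; check the wiki before relying on this.

```shell
#!/bin/sh
# Build a mirrorlist "Server" line for the Arch Linux Archive snapshot of a
# given date. $repo and $arch are literal placeholders that pacman expands
# itself, so they are deliberately left unexpanded here.
archive_server_line() {
    year="$1"; month="$2"; day="$3"
    printf 'Server=https://archive.archlinux.org/repos/%s/%s/%s/$repo/os/$arch\n' \
        "$year" "$month" "$day"
}

# Snapshot date used in the post above.
archive_server_line 2022 09 21

# Inside the chroot, make that line the only active entry in
# /etc/pacman.d/mirrorlist (keep a backup of the old file), then run:
#   pacman -Syyuu
```

Remember to restore the normal mirrorlist once a fixed driver is released, otherwise the system stays pinned to that date.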

Hmm, I actually uninstalled and then reinstalled using the last cached version. Seems to be working fine. I would usually avoid reinstalling as it takes a few hours to get everything working…
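
For the cached-package route, a sketch along these lines may avoid the uninstall step. It assumes the old packages are still in pacman's default cache directory and use the usual `.pkg.tar.zst` naming; `cached_driver_packages` is a made-up helper. Passing nvidia and nvidia-utils to a single `pacman -U` lets pacman resolve their mutual dependency in one transaction.

```shell
#!/bin/sh
# List cached nvidia and nvidia-utils package files for one driver version.
# $1 = driver version, $2 = cache directory (defaults to pacman's cache).
cached_driver_packages() {
    version="$1"
    cache_dir="${2:-/var/cache/pacman/pkg}"
    for f in "$cache_dir"/nvidia-"$version"-*.pkg.tar.zst \
             "$cache_dir"/nvidia-utils-"$version"-*.pkg.tar.zst; do
        # An unmatched glob stays literal, so keep only paths that exist.
        [ -e "$f" ] && printf '%s\n' "$f"
    done
    return 0
}

# Show what is available for the version reported as working in this thread.
cached_driver_packages 515.65.01

# If both files are listed, install them together as root:
#   pacman -U $(cached_driver_packages 515.65.01)
# To stop the next -Syu from upgrading them again, add to /etc/pacman.conf:
#   IgnorePkg = nvidia nvidia-utils
```

If lib32-nvidia-utils or a DKMS variant is installed, it would need the same treatment; the sketch only covers the two packages named in the question above.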

That is great. I’m glad you were able to get up and running again. Hopefully nvidia solves the issue on the next update because we can’t update until then.

Following the suggestions from other folks (thanks!) on the Arch Linux and open NVIDIA driver bug trackers, this actually works for me on 515.76:

  1. I have a system with a RTX3060 connected to a HDMI monitor through a KVM switch (work monitor) and a RTX3060 connected directly to a DP monitor (calibrated for graphics work).

  2. I switch the KVM to the other system, not the one with the RTX3060.

  3. I boot my system. Now the POST/Linux console is on the DP monitor; usually it is on the HDMI one. I log in and run startx.

  4. I switch the KVM back to the RTX3060 system and I have my usual dual display / GPU correctly working.

So it looks like there is something in the console initialization code specific to HDMI.

Black screen over here too, over HDMI on a 3060 and a 3090, on Fedora and Arch respectively. Had to downgrade.

Kernel        5.19.12-AMD
Driver        515.76
PCIe GPU      RTX 3090 24G
Motherboard   B550-A MC-7C56
CPU           AMD Ryzen 9 5950X
Distribution  Arch
TV            SONY XR-42A90k
Port          HDMI

Symptom

Black screen when booting from 515.76 using HDMI. System eventually hard freezes. Unable to change TTY.

TEMP FIX: Add nvidia-drm.modeset=0 to the kernel command line in /etc/default/grub and run sudo grub-mkconfig -o /boot/grub/grub.cfg
Temporarily resolves the problem, but causes other issues with my system. Best to just downgrade the drivers.
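
A sketch of that edit, for anyone who prefers not to do it by hand. The post does not say which variable was changed; this assumes the parameter goes into `GRUB_CMDLINE_LINUX_DEFAULT` and that the value is wrapped in double quotes. `add_kernel_param` is a made-up helper that only transforms text on stdin, so nothing is written until you redirect the output yourself.

```shell
#!/bin/sh
# Read /etc/default/grub-style text on stdin and append a kernel parameter to
# the GRUB_CMDLINE_LINUX_DEFAULT line, unless that line already contains it.
# Assumes the value is double-quoted and the parameter contains no slashes.
add_kernel_param() {
    param="$1"
    sed "/^GRUB_CMDLINE_LINUX_DEFAULT=/{/$param/!s/\"\$/ $param\"/;}"
}

# Demo on a sample line; no file is touched.
printf '%s\n' 'GRUB_CMDLINE_LINUX_DEFAULT="quiet"' \
  | add_kernel_param nvidia-drm.modeset=0

# Applying it for real, as root, keeping a backup:
#   cp /etc/default/grub /etc/default/grub.bak
#   add_kernel_param nvidia-drm.modeset=0 < /etc/default/grub.bak > /etc/default/grub
#   grub-mkconfig -o /boot/grub/grub.cfg
```

To undo the workaround later, restore the backup and regenerate grub.cfg.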

Same issue here!

Driver 515.76
Nvidia 3090
AMD Ryzen 9 5950X

Black screen when booting from 515.76 using HDMI. The system eventually hard freezes. Unable to change TTY.

I have the same issue.

Same issue on Ubuntu, single RTX3080 with TV connected to HDMI.

I switched from HDMI to DisplayPort and 515.76 works fine. Hmm.

I think I might try DP out as well but I need to buy one first.