545 Driver Black Screen Pop!_OS 22.04 RTX4090

I’ve been fighting this one for a good while. With either the 535 or 545 driver, I get a black screen on boot over the DisplayPort connection from the RTX 4090 to an Alienware AW3423DW. I have to SSH into the box to get any logs, as CTRL+ALT+F1, F2, F3, etc. does not give me a TTY.

Also, I was normally able to see the disk encryption screen to enter the password, but after upgrading to 545 that is blank as well, so I just have to guess when it’s time to enter the password.

λ nigel [~] → grep nvidia /lib/modprobe.d/* /etc/modprobe.d/*
/etc/modprobe.d/blacklist-framebuffer.conf:blacklist nvidiafb
/etc/modprobe.d/nvidia-graphics-drivers-kms.conf:# This file was generated by nvidia-driver-545
/etc/modprobe.d/nvidia-graphics-drivers-kms.conf:options nvidia-drm modeset=1
/etc/modprobe.d/system76-power.conf:blacklist i2c_nvidia_gpu
/etc/modprobe.d/system76-power.conf:alias i2c_nvidia_gpu off
λ nigel [~] → inxi -Gx --display
Graphics:
  Device-1: NVIDIA vendor: Gigabyte driver: nvidia v: 545.23.06 bus-ID: 02:00.0
  Display: server: X.org v: 1.21.1.4 driver: X: loaded: modesetting,nvidia
    unloaded: fbdev,nouveau,vesa gpu: nvidia
  Message: No GL data found on this system.
λ nigel [~] → dkms status
nvidia/545.23.06, 6.5.4-76060504-generic, x86_64: installed
system76/1.0.14~1684961628~22.04~8c2ff21, 6.4.6-76060406-generic, x86_64: installed
system76/1.0.14~1684961628~22.04~8c2ff21, 6.5.4-76060504-generic, x86_64: installed
system76_acpi/1.0.2~1689789919~22.04~03a5804, 6.4.6-76060406-generic, x86_64: installed (original_module exists)
system76_acpi/1.0.2~1689789919~22.04~03a5804, 6.5.4-76060504-generic, x86_64: installed (original_module exists)
system76-io/1.0.3~1695233384~22.04~0f86350, 6.4.6-76060406-generic, x86_64: installed
system76-io/1.0.3~1695233384~22.04~0f86350, 6.5.4-76060504-generic, x86_64: installed

nvidia-bug-report.log.gz (386.9 KB)

gdm.log (261.1 KB)

It seems you’re running a VM with NVIDIA passthrough, so the primary graphics device is the virtual VGA, which likely doesn’t support display offloading. Please create an nvidia-only xorg.conf, or try disabling the virtual graphics.
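For anyone following along, here is a minimal sketch of what such an nvidia-only xorg.conf could look like. It is written as a small Python helper because the bus ID needs converting: inxi and lspci print it in hexadecimal (02:00.0 in the output above), while Xorg's BusID option expects decimal PCI:bus:device:function. The function names, the identifiers nvidia0 and screen0, and the choice of sections are illustrative assumptions, not something taken from the bug report.

```python
def xorg_bus_id(pci_addr: str) -> str:
    """Convert an lspci-style address ('02:00.0' or '0000:02:00.0')
    into the decimal 'PCI:bus:device:function' form Xorg expects."""
    parts = pci_addr.split(":")
    bus_hex = parts[-2]                      # ignore an optional PCI domain prefix
    dev_hex, fn_hex = parts[-1].split(".")
    return f"PCI:{int(bus_hex, 16)}:{int(dev_hex, 16)}:{int(fn_hex, 16)}"


def nvidia_only_xorg_conf(pci_addr: str) -> str:
    """Render a bare-bones config with a single NVIDIA device and screen.
    Identifiers are hypothetical; adjust to taste."""
    return "\n".join([
        'Section "Device"',
        '    Identifier "nvidia0"',
        '    Driver     "nvidia"',
        f'    BusID      "{xorg_bus_id(pci_addr)}"',
        'EndSection',
        '',
        'Section "Screen"',
        '    Identifier "screen0"',
        '    Device     "nvidia0"',
        'EndSection',
        '',
    ])


if __name__ == "__main__":
    # 02:00.0 is the bus ID reported by inxi and nvidia-smi in this thread.
    print(nvidia_only_xorg_conf("02:00.0"))
```

The printed text would typically be saved as /etc/X11/xorg.conf. With only one Device section, Xorg has no virtual VGA to pick as the primary device.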

You are correct, it is a VM running on Proxmox with PCI GPU passthrough. I forgot to change it back to the normal settings as I was debugging stuff. Please find a new bug report and gdm log attached. Also, here is the nvidia-smi info.

It’s running through a Level1Techs KVM, so the disconnect of the mouse/keyboard is me switching back to my laptop to write this post :)

λ nigel [~] → nvidia-smi
Sun Oct 22 11:16:59 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.06              Driver Version: 545.23.06    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4090        Off | 00000000:02:00.0 Off |                  Off |
|  0%   42C    P8              14W / 450W |    252MiB / 24564MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1497      G   /usr/lib/xorg/Xorg                          182MiB |
|    0   N/A  N/A      2098      G   /usr/bin/gnome-shell                         57MiB |
+---------------------------------------------------------------------------------------+

nvidia-bug-report.log.gz (449.2 KB)

gdm.log (574.5 KB)

The X server is starting correctly on the NVIDIA GPU, detects your monitor, then shows the monitor as disconnected when you switch the KVM away.
Still no output on the monitor?

That’s correct. The monitor just flashes like it’s getting a signal, then it isn’t, then it is again. I have a short YouTube video that shows what is going on (pre-545 install, so you can see the crypt screen on the Linux VM).

It’s acting like it doesn’t see the monitor, so it disconnects and tries to reconnect. I’ve been in contact with the KVM manufacturer (Wendell @ Level1Techs) and he’s saying that the KVM is invisible; it’s just an electrical connection between the two DisplayPort cables. Each cable I’m using is a DP2.1 fiber optic cable. In the video I also boot a Windows 11 VM that has no issues with the display on the same hardware.

The NVIDIA Linux driver is extremely touchy when it comes to DisplayPort KVM switches. Did you already try a direct connection?

Yes, I have, and that seems to work correctly, or rather it did; I haven’t tried it on the 545 driver yet. On 535 it would work correctly going through the KVM with a DP 1.2 cable, but unfortunately that then causes problems on the Windows VM. Is there some flag I can set to force the driver to play nicer with the KVM?

Also, it’s hitting the black screen before I’ve switched away to another machine, so in theory it would not even know it was on a KVM at first boot.

Unfortunately not, at least none I know of. It’s often hit-and-miss with KVMs and driver versions, and with how many DP lanes at what speed are chosen (as seen in nvidia-settings).

Here is direct connect with a DP 1.4 cable:

Here is going through the KVM w/ the DP1.2 cable:

Would I have better luck with the open source driver? I’m not really trying to game on the Linux VM; it’s my development environment, so I would just need access to the CUDA cores, etc. I would like the native resolution of my monitor though (3440x1440).

It might be interesting to see what it negotiates when the monitor is not displaying anything. Care to use VNC from another computer in that case?
I guess you’re talking about the “-open” version of the NVIDIA driver. It might be worth a shot to use that and see if it handles DP connections differently.

I’ll set up the VNC connection later today and report back. Thanks so much for the help!

So this is very weird. While testing RDP/VNC the monitor just started working, but GNOME was acting very weird. I did manage to capture a screenshot of the monitor, and it looks like it only negotiated 2 lanes on the DisplayPort link, but they were at 8.1 Gbps each. I’m going to try the -open drivers next.

I guess the monitor started working because it was set to 60Hz instead of the full 175Hz.
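A rough back-of-the-envelope check is consistent with that guess. The sketch below assumes DisplayPort 1.x 8b/10b encoding (so only 80% of the 8.1 Gbps line rate carries pixel data), 24 bits per pixel, and ignores blanking intervals, so real requirements are somewhat higher than these figures. The function names are made up for illustration.

```python
HBR3_GBPS_PER_LANE = 8.1       # per-lane line rate reported in the post above
ENCODING_EFFICIENCY = 8 / 10   # DisplayPort 1.x uses 8b/10b encoding


def link_payload_gbps(lanes: int, lane_rate_gbps: float = HBR3_GBPS_PER_LANE) -> float:
    """Usable bandwidth of the link after encoding overhead."""
    return lanes * lane_rate_gbps * ENCODING_EFFICIENCY


def pixel_rate_gbps(width: int, height: int, refresh_hz: int, bits_per_pixel: int = 24) -> float:
    """Bandwidth for active pixels only (blanking ignored, so this is a lower bound)."""
    return width * height * refresh_hz * bits_per_pixel / 1e9


if __name__ == "__main__":
    for lanes in (2, 4):
        available = link_payload_gbps(lanes)
        for hz in (60, 144, 175):
            needed = pixel_rate_gbps(3440, 1440, hz)
            verdict = "fits" if needed <= available else "does NOT fit"
            print(f"{lanes} lanes ({available:.2f} Gbps): "
                  f"3440x1440@{hz} Hz needs >= {needed:.2f} Gbps, {verdict}")
```

Under these assumptions a 2-lane link has room for 3440x1440 at 60 Hz but not at 144 Hz or 175 Hz, while a 4-lane link at the same rate has room for all three at 24 bits per pixel.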

Just a quick update. I got it working at 175 Hz (though I had to drop back to 144 Hz because of a font color issue). The magic combo was the latest 545 driver along with a less expensive (at time of purchase) fiber optic DP cable going to the KVM, which has a more expensive fiber optic cable going to the monitor.

It would not function in any other configuration of DP cables (expensive → expensive, less-expensive → less-expensive, etc.). For reference, here are the KVM and cables that I’ve used, in case anyone else runs into this weird issue.

Level1Techs DP1.4 KVM:

FIBBR DP2.1 3m cable from KVM to Monitor:
https://www.amazon.com/FIBBR-Certified-DisplayPort-DP2-1-Supports/dp/B0C3CT6D6W?ref_=ast_sto_dp&th=1

BIFALE DP1.4 Cable from PC to KVM:
https://www.amazon.com/gp/product/B07X4W61YP/ref=ppx_yo_dt_b_asin_title_o07_s00?ie=UTF8&psc=1
