Not Supported => Ubuntu 18.04.2 LTS Server / 18.10 + two RTX 2080 Ti Founders + 3 monitors + KDE Plasma

Hey guys,
Has anyone been able to get multiple monitors working with multiple GPUs on Ubuntu 18.04.2 LTS Server?
My setup:
Ubuntu 18.04.2 LTS Server + two RTX 2080 Ti Founders + 3 monitors + KDE Plasma
1 monitor on GPU0
2 monitors on GPU1

I get to the login screen fine, but after login the two side screens go black.

The mouse moves onto those screens, but no apps can go there.
I used NVIDIA X Server Settings to get it working.

Running the 418.43 drivers.

Try creating a minimal /etc/X11/xorg.conf containing just:

Section "Device"
    Identifier     "nvidia"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BusID          "PCI:1:0:0"
    Option         "BaseMosaic" "true"
    Option         "AllowEmptyInitialConfiguration"
EndSection

Replace the BusID with the PCI ID (in decimal) of your first GPU.
Might require an NVLink bridge.
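
On the decimal part: nvidia-smi prints the Bus-Id in hex as domain:bus:device.function, while xorg.conf expects decimal in the form PCI:bus:device:function. A rough sketch of the conversion follows. The helper name smi_bus_id_to_xorg is made up for illustration, and it assumes PCI domain 0 (it simply drops the domain field).

```python
def smi_bus_id_to_xorg(bus_id: str) -> str:
    """Convert an nvidia-smi Bus-Id (hex, domain:bus:device.function)
    into an xorg.conf BusID string (decimal, PCI:bus:device:function).

    Hypothetical helper; assumes PCI domain 0, so the domain is ignored.
    """
    _domain, bus, dev_func = bus_id.split(":")
    device, function = dev_func.split(".")
    # Each field is hexadecimal in nvidia-smi output; Xorg wants decimal.
    return f"PCI:{int(bus, 16)}:{int(device, 16)}:{int(function, 16)}"


if __name__ == "__main__":
    for bus_id in ("00000000:A1:00.0", "00000000:C1:00.0"):
        print(bus_id, "->", smi_bus_id_to_xorg(bus_id))
```

So a Bus-Id of 00000000:A1:00.0 would become BusID "PCI:161:0:0" in the Device section.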

Nope, that didn’t work. I have an NVIDIA NVLink bridge as well.

I also modified the xorg.conf into a generic version of the above that includes GPU1.
I’m going to try moving to Ubuntu Server 18.10.

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.43       Driver Version: 418.43       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:A1:00.0 Off |                  N/A |
| 41%   29C    P8     1W / 260W |      1MiB / 10989MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 208...  Off  | 00000000:C1:00.0  On |                  N/A |
| 41%   36C    P8     5W / 260W |    545MiB / 10986MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+

If anyone has this working, let me know. Right now, I’m feeling like I wasted 8k+ on a useless machine if I have to install Windows.
nvidia-bug-report.log.gz (1.66 MB)

Please run nvidia-bug-report.sh as root while the xorg.conf is active and attach the resulting .gz file to your post. Hovering the mouse over an existing post of yours will reveal a paperclip icon.
https://devtalk.nvidia.com/default/topic/1043347/announcements/attaching-files-to-forum-topics-posts/

Attached the log. Here is the xorg.conf:

xorg.zip (924 Bytes)

Hey,
So if I plug the monitors into GPU0, it works.

So does this mean that no one should buy dual NVIDIA RTX 2080 Ti cards?

They don’t work with Ubuntu.

Wish I had known that before I dropped 1400ish bucks on it.

The keyword for this is BaseMosaic:
https://http.download.nvidia.com/XFree86/Linux-x86/325.15/README/xconfigoptions.html
It is mainly meant for Quadros but should work for GeForce, too, with limitations (three monitors max.). For some people this works, for others it doesn’t.
Please use the xorg.conf from my post #2 and create a new nvidia-bug-report.log afterwards.

Nooooooooooooooo!

It worked perfectly… then on a reboot it went to a black screen… God, this sucks.

OK, attaching logs here shortly.

I had to go into recovery mode and delete the xorg.conf to get it to boot.

Any other ideas?

nvidia-bug-report.log.gz (1.68 MB)

Create a new nvidia-bug-report.log; maybe some errors from the previous boot were caught.

Log uploaded. I had to boot into safe mode and delete xorg.conf to get it to boot.
Here is the xorg.conf. I’m just afraid to reboot, because it’ll die.

Proof NVLink is working:

GPU 0: GeForce RTX 2080 Ti (UUID: GPU-13e93ad2-7d12-8ab0-2332-a1582e87a7bb)
         Link 0: 25.781 GB/s
         Link 1: 25.781 GB/s
         Link 0, P2P is supported: true
         Link 0, Access to system memory supported: true
         Link 0, P2P atomics supported: true
         Link 0, System memory atomics supported: true
         Link 0, SLI is supported: true
         Link 0, Link is supported: false
         Link 1, P2P is supported: true
         Link 1, Access to system memory supported: true
         Link 1, P2P atomics supported: true
         Link 1, System memory atomics supported: true
         Link 1, SLI is supported: true
         Link 1, Link is supported: false
GPU 1: GeForce RTX 2080 Ti (UUID: GPU-fecd8180-a2a9-0c79-c90b-f7a0fb7dd50a)
         Link 0: 25.781 GB/s
         Link 1: 25.781 GB/s
         Link 0, P2P is supported: true
         Link 0, Access to system memory supported: true
         Link 0, P2P atomics supported: true
         Link 0, System memory atomics supported: true
         Link 0, SLI is supported: true
         Link 0, Link is supported: false
         Link 1, P2P is supported: true
         Link 1, Access to system memory supported: true
         Link 1, P2P atomics supported: true
         Link 1, System memory atomics supported: true
         Link 1, SLI is supported: true
         Link 1, Link is supported: false

xorg.conf.zip (396 Bytes)

After some time, you were running into problems with one of your monitors:

Mar 19 12:16:49 eqhome19 kernel: [   27.680801] nvidia-modeset: ERROR: GPU:0: Failure reading maximum pixel clock value for display device LG Electronics LG Ultra HD (DP-0).
Mar 19 12:16:49 eqhome19 kernel: [   27.686776] nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:0:0:0x00000033

Some time later, even the type of monitor couldn’t be detected anymore:

Mar 19 13:24:56 eqhome19 kernel: nvidia-modeset: WARNING: GPU:0: Unable to read EDID for display device DP-0
Mar 19 13:24:56 eqhome19 kernel: nvidia-modeset: ERROR: GPU:0: Failure reading maximum pixel clock value for display device DP-0.
Mar 19 13:24:56 eqhome19 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:0:0:0x00000033
Mar 19 13:24:56 eqhome19 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:2:0:0x00000033
Mar 19 13:24:56 eqhome19 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:4:0:0x00000033
Mar 19 13:24:56 eqhome19 kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c57e:6:1:0x00000033
Mar 19 13:24:56 eqhome19 kernel: nvidia-modeset: ERROR: GPU:0: DP-0: Failed to disable DisplayPort audio stream-0

Leading to the X server driver error:

[    26.538] (EE) NVIDIA(GPU-0): Failed to select a display subsystem.

Defective cable/monitor/connector?
Please put the xorg.conf back in place and try with a single monitor, one after another.
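
To see at a glance which GPU the nvidia-modeset messages pile up on while testing one monitor at a time, something like the following might help. It is only a sketch: the function name summarize_modeset and the regular expression are assumptions based on the kernel log lines quoted above, and other driver versions may format these messages differently.

```python
import re
from collections import Counter

# Assumed line shape, taken from the log excerpts in this thread:
#   "... nvidia-modeset: ERROR: GPU:0: <message>"
LINE_RE = re.compile(r"nvidia-modeset: (ERROR|WARNING): GPU:(\d+): (.*)")

# Three lines copied from the log excerpt above, used as sample input.
SAMPLE = [
    "Mar 19 13:24:56 eqhome19 kernel: nvidia-modeset: WARNING: GPU:0: Unable to read EDID for display device DP-0",
    "Mar 19 13:24:56 eqhome19 kernel: nvidia-modeset: ERROR: GPU:0: Failure reading maximum pixel clock value for display device DP-0.",
    "Mar 19 13:24:56 eqhome19 kernel: nvidia-modeset: ERROR: GPU:0: DP-0: Failed to disable DisplayPort audio stream-0",
]


def summarize_modeset(lines):
    """Count nvidia-modeset messages per (GPU index, severity)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            level, gpu, _message = match.groups()
            counts[(int(gpu), level)] += 1
    return counts


if __name__ == "__main__":
    for (gpu, level), n in sorted(summarize_modeset(SAMPLE).items()):
        print(f"GPU {gpu}: {n} {level} line(s)")
```

Replace SAMPLE with the lines of your own kernel log (for example, the text inside the unpacked nvidia-bug-report.log) to compare runs with different monitors attached.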

I don’t think I have any problems with the hardware.

I’ll do what you said. Here is the state of the world.

  1. I can plug all three monitors into GPU0 and everything works.
  2. I can plug my primary monitor into GPU0 and the 2 secondary monitors into GPU1. Configured with Xinerama off, the screens come alive and work with the mouse, but don’t work with the OS. https://imgur.com/a/laEh5ws
  3. With the same physical setup as above, if I turn Xinerama on, the screens come alive at login, but after login it’s the same as above. https://imgur.com/a/HOXlTKq

My question is…

So I can always see my mouse on the screens, but the OS doesn’t seem to see them.

Unless everything is plugged into one GPU. I bought two GPUs to help with my deep learning problems for my global business. However, the only thing I’m learning here is that I should never have bought a second GPU.

Does this change your advice?

Generix,
I spent more than 30 hours on this.

I think I’m going to plug everything into GPU0 and just accept the fact that I burnt $1,400-1,500 on that second chip + NVLink.

I would recommend that others get a cheap chip and spend the money I burned on AWS/GCP credits if they want to do machine learning.

Running this setup is not going to work.

If you hear anything in the future let me know. I’ll be willing to try.