Cannot get working config of SLI Mosaic on CentOS Stream

I have a box running CentOS Stream 8 with the following cards:

[root@alta X11]# lspci | grep -i nvidia | grep -i vga
0000:17:00.0 VGA compatible controller: NVIDIA Corporation GA102 [GeForce RTX 3090] (rev a1)
0000:65:00.0 VGA compatible controller: NVIDIA Corporation GP107GL [Quadro P620] (rev a1)
0000:b3:00.0 VGA compatible controller: NVIDIA Corporation GP107GL [Quadro P620] (rev a1)

I have WaylandEnable=false in /etc/gdm/custom.conf.

A single monitor is connected to the last Quadro card.

If I try to start X with no xorg.conf I get:

[ 6785.383] (EE) NVIDIA(GPU-0): Failed to find a valid Base Mosaic configuration.
[ 6785.383] (EE) NVIDIA(GPU-0): Invalid Base Mosaic configuration 1 of 4:
[ 6785.383] (EE) NVIDIA(GPU-0): GPUs:
[ 6785.383] (EE) NVIDIA(GPU-0): 1) NVIDIA GPU at PCI:101:0:0
[ 6785.383] (EE) NVIDIA(GPU-0): 2) NVIDIA GPU at PCI:179:0:0
[ 6785.383] (EE) NVIDIA(GPU-0): Errors:
[ 6785.383] (EE) NVIDIA(GPU-0): - Does not contain GPU selected for this X screen
[ 6785.383] (EE) NVIDIA(GPU-0): Invalid Base Mosaic configuration 2 of 4:
[ 6785.383] (EE) NVIDIA(GPU-0): GPUs:
[ 6785.383] (EE) NVIDIA(GPU-0): 1) NVIDIA GPU at PCI:23:0:0
[ 6785.383] (EE) NVIDIA(GPU-0): 2) NVIDIA GPU at PCI:101:0:0
[ 6785.383] (EE) NVIDIA(GPU-0): 3) NVIDIA GPU at PCI:179:0:0
[ 6785.383] (EE) NVIDIA(GPU-0): Errors:
[ 6785.383] (EE) NVIDIA(GPU-0): - The video link was not detected
[ 6785.383] (EE) NVIDIA(GPU-0): - Chipset not approved for SLI
[ 6785.383] (EE) NVIDIA(GPU-0): - GPU PCI IDs do not match
[ 6785.383] (EE) NVIDIA(GPU-0): Invalid Base Mosaic configuration 3 of 4:
[ 6785.383] (EE) NVIDIA(GPU-0): GPUs:
[ 6785.383] (EE) NVIDIA(GPU-0): 1) NVIDIA GPU at PCI:23:0:0
[ 6785.383] (EE) NVIDIA(GPU-0): 2) NVIDIA GPU at PCI:179:0:0
[ 6785.383] (EE) NVIDIA(GPU-0): Errors:
[ 6785.383] (EE) NVIDIA(GPU-0): - The video link was not detected
[ 6785.383] (EE) NVIDIA(GPU-0): - GPU PCI IDs do not match
[ 6785.383] (EE) NVIDIA(GPU-0): - Unknown error
[ 6785.383] (EE) NVIDIA(GPU-0): Invalid Base Mosaic configuration 4 of 4:
[ 6785.383] (EE) NVIDIA(GPU-0): GPUs:
[ 6785.383] (EE) NVIDIA(GPU-0): 1) NVIDIA GPU at PCI:23:0:0
[ 6785.383] (EE) NVIDIA(GPU-0): 2) NVIDIA GPU at PCI:101:0:0
[ 6785.383] (EE) NVIDIA(GPU-0): Errors:
[ 6785.383] (EE) NVIDIA(GPU-0): - The video link was not detected
[ 6785.383] (EE) NVIDIA(GPU-0): - GPU PCI IDs do not match
[ 6785.383] (EE) NVIDIA(GPU-0): - Unknown error
[ 6785.383] (WW) NVIDIA(GPU-0): Failed to find a valid Base Mosaic configuration for the
[ 6785.383] (WW) NVIDIA(GPU-0): NVIDIA graphics device PCI:23:0:0. Please see Chapter 30:
[ 6785.383] (WW) NVIDIA(GPU-0): Configuring SLI and Multi-GPU Mosaic in the README for
[ 6785.383] (WW) NVIDIA(GPU-0): troubleshooting suggestions.
[ 6785.383] (EE) NVIDIA(GPU-0): Only one GPU will be used for this X screen.
[ 6785.384] (EE) NVIDIA(GPU-0): Failed to select a display subsystem.
[ 6785.384] (EE) NVIDIA(0): Failing initialization of X screen
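
For readers who want to compare the rejected configurations side by side, here is a minimal sketch that groups the GPUs and errors under each "Invalid Base Mosaic configuration" header. The function name rejected_mosaic_configs and the regular expressions are hypothetical helpers written for this thread, not part of any NVIDIA tool, and they assume the log line layout shown above.

```python
import re

# "[  6785.383] (EE) NVIDIA(GPU-0): message" -> captures "message".
_PREFIX = re.compile(r"^\[\s*\d+\.\d+\]\s+\((?:EE|WW|II)\)\s+NVIDIA\([^)]*\):\s*(.*)$")
_CONFIG = re.compile(r"^Invalid Base Mosaic configuration (\d+) of \d+:$")
_GPU = re.compile(r"^\d+\)\s+NVIDIA GPU at (PCI:[0-9:@]+)$")
_ERROR = re.compile(r"^-\s+(.+)$")

# Two of the four blocks from the log above, copied verbatim.
SAMPLE = """\
[ 6785.383] (EE) NVIDIA(GPU-0): Invalid Base Mosaic configuration 1 of 4:
[ 6785.383] (EE) NVIDIA(GPU-0): GPUs:
[ 6785.383] (EE) NVIDIA(GPU-0): 1) NVIDIA GPU at PCI:101:0:0
[ 6785.383] (EE) NVIDIA(GPU-0): 2) NVIDIA GPU at PCI:179:0:0
[ 6785.383] (EE) NVIDIA(GPU-0): Errors:
[ 6785.383] (EE) NVIDIA(GPU-0): - Does not contain GPU selected for this X screen
[ 6785.383] (EE) NVIDIA(GPU-0): Invalid Base Mosaic configuration 3 of 4:
[ 6785.383] (EE) NVIDIA(GPU-0): GPUs:
[ 6785.383] (EE) NVIDIA(GPU-0): 1) NVIDIA GPU at PCI:23:0:0
[ 6785.383] (EE) NVIDIA(GPU-0): 2) NVIDIA GPU at PCI:179:0:0
[ 6785.383] (EE) NVIDIA(GPU-0): Errors:
[ 6785.383] (EE) NVIDIA(GPU-0): - The video link was not detected
[ 6785.383] (EE) NVIDIA(GPU-0): - GPU PCI IDs do not match
[ 6785.383] (EE) NVIDIA(GPU-0): - Unknown error
"""


def rejected_mosaic_configs(log_text):
    """Group the GPUs and errors listed under each rejected configuration."""
    configs = []
    current = None
    for line in log_text.splitlines():
        prefixed = _PREFIX.match(line.strip())
        if prefixed is None:
            continue  # not an NVIDIA driver message
        msg = prefixed.group(1).strip()
        header = _CONFIG.match(msg)
        if header:
            current = {"index": int(header.group(1)), "gpus": [], "errors": []}
            configs.append(current)
            continue
        if current is None:
            continue  # message appears before the first configuration block
        gpu = _GPU.match(msg)
        error = _ERROR.match(msg)
        if gpu:
            current["gpus"].append(gpu.group(1))
        elif error:
            current["errors"].append(error.group(1))
    return configs


if __name__ == "__main__":
    for cfg in rejected_mosaic_configs(SAMPLE):
        print(cfg["index"], ", ".join(cfg["gpus"]), "->", "; ".join(cfg["errors"]))
```

Run against the full log above, this makes one pattern easy to see: every configuration that includes PCI:23:0:0 reports "GPU PCI IDs do not match", while the pairing of PCI:101:0:0 and PCI:179:0:0 is rejected only because it does not contain the GPU selected for the X screen.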

If I run with just this in xorg.conf:

Section "Device"
    Identifier "Videocard0"
    Driver "nvidia"
    BusID "PCI:179:0:0"
EndSection
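
A note on the numbers: lspci prints the bus address in hexadecimal (17, 65, b3), while the BusID option and the driver log use decimal (23, 101, 179). A minimal sketch of the conversion follows. The function name lspci_to_busid is hypothetical, and the bus@domain form for a non-zero PCI domain is my understanding of the Xorg BusID syntax rather than something shown in this thread.

```python
import re

# Matches lspci-style addresses such as "0000:b3:00.0" or "b3:00.0" (hex fields).
_LSPCI_ADDR = re.compile(
    r"^(?:(?P<domain>[0-9a-fA-F]{4,}):)?"
    r"(?P<bus>[0-9a-fA-F]{2}):(?P<dev>[0-9a-fA-F]{2})\.(?P<func>[0-7])$"
)


def lspci_to_busid(addr):
    """Convert an lspci address (hex) to an xorg.conf BusID string (decimal)."""
    m = _LSPCI_ADDR.match(addr.strip())
    if m is None:
        raise ValueError(f"not an lspci address: {addr!r}")
    domain = int(m.group("domain") or "0", 16)
    bus = int(m.group("bus"), 16)
    dev = int(m.group("dev"), 16)
    func = int(m.group("func"), 16)
    # Assumption: a non-zero PCI domain is written as "bus@domain".
    bus_part = f"{bus}@{domain}" if domain else str(bus)
    return f"PCI:{bus_part}:{dev}:{func}"


if __name__ == "__main__":
    # The three addresses from the lspci output at the top of the thread.
    for addr in ("0000:17:00.0", "0000:65:00.0", "0000:b3:00.0"):
        print(addr, "->", lspci_to_busid(addr))
```

With the addresses from the lspci output above, b3:00.0 (the Quadro with the monitor attached) becomes PCI:179:0:0, 65:00.0 becomes PCI:101:0:0, and 17:00.0 (the RTX 3090) becomes PCI:23:0:0.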

I get:

[ 6964.585] (EE) NVIDIA(GPU-0): Failed to select a display subsystem.
[ 6964.586] (EE) NVIDIA(GPU-0): Only one GPU will be used for this X screen.
[ 6964.586] (EE) NVIDIA(GPU-0): The NVIDIA graphics device PCI:179:0:0 is part of an active
[ 6964.586] (EE) NVIDIA(GPU-0): SLI configuration and is currently unavailable for single
[ 6964.586] (EE) NVIDIA(GPU-0): GPU rendering. Please see Chapter 30: Configuring SLI and
[ 6964.586] (EE) NVIDIA(GPU-0): Multi-GPU Mosaic in the README for troubleshooting
[ 6964.586] (EE) NVIDIA(GPU-0): information.
[ 6964.586] (EE) *** Aborting ***
[ 6964.586] (EE) NVIDIA(0): Failing initialization of X screen

If I run 'nvidia-xconfig --sli=mosaic' and use the resulting xorg.conf, which contains:

Section "Device"
    Identifier "Device0"
    Driver "nvidia"
    VendorName "NVIDIA Corporation"
EndSection

Section "Screen"
    Identifier "Screen0"
    Device "Device0"
    Monitor "Monitor0"
    DefaultDepth 24
    Option "SLI" "mosaic"
    SubSection "Display"
        Depth 24
    EndSubSection
EndSection

I get the same error as in the first case (no xorg.conf).

Is there any way to make this configuration work?

Did you already try this?
Section "Device"
    Identifier "Videocard0"
    Driver "nvidia"
    BusID "PCI:101:0:0"
EndSection

Is there a monitor connected to the 3090? If so, please try disconnecting it before booting.

If I set it to use BusID "PCI:101:0:0", I get:

[ 21569.056] (EE) NVIDIA(GPU-0): The NVIDIA graphics device PCI:101:0:0 bound to this Base
[ 21569.056] (EE) NVIDIA(GPU-0): Mosaic X screen is not the Base Mosaic parent device.
[ 21569.056] (EE) NVIDIA(GPU-0): This configuration is not currently supported. Please add
[ 21569.056] (EE) NVIDIA(GPU-0): 'BusID "PCI:179:0:0"' to the Base Mosaic "Device" section
[ 21569.056] (EE) NVIDIA(GPU-0): in the X configuration file.
[ 21569.056] (EE) NVIDIA(GPU-0): Only one GPU will be used for this X screen.
[ 21569.057] (EE) NVIDIA(GPU-0): The NVIDIA graphics device PCI:179:0:0 is part of an active
[ 21569.057] (EE) NVIDIA(GPU-0): SLI configuration and is currently unavailable for single
[ 21569.057] (EE) NVIDIA(GPU-0): GPU rendering. Please see Chapter 30: Configuring SLI and
[ 21569.057] (EE) NVIDIA(GPU-0): Multi-GPU Mosaic in the README for troubleshooting
[ 21569.057] (EE) NVIDIA(GPU-0): information.
[ 21569.057] (EE) *** Aborting ***
[ 21569.057] (EE) NVIDIA(0): Failing initialization of X screen

There is no monitor connected to the 3090.

Thanks

Please run nvidia-bug-report.sh as root and attach the resulting nvidia-bug-report.log.gz file to your post.

Here it is

nvidia-bug-report.log.gz (2.4 MB)

Please try commenting out the
Option PrimaryGPU
Option SLI
lines in /etc/X11/xorg.conf.d/10-nvidia.conf.
Then use your first xorg.conf:
Section "Device"
    Identifier "Videocard0"
    Driver "nvidia"
    BusID "PCI:179:0:0"
EndSection
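
Since these options can be set in a snippet file as well as in xorg.conf, it may help to list every place where they are active. Below is a minimal sketch that reports uncommented SLI, BaseMosaic and PrimaryGPU options. The helper name active_options and the list of directories scanned are assumptions for illustration (the /usr/share/X11/xorg.conf.d path in particular is not mentioned in this thread).

```python
import glob
import re

# Option names discussed in this thread, normalized to lower case.
WATCHED = {"sli", "basemosaic", "primarygpu"}

# Option "Name" ["Value"]; the value is optional.
_OPTION = re.compile(r'^\s*Option\s+"([^"]+)"(?:\s+"([^"]*)")?', re.IGNORECASE)


def active_options(text):
    """Return (line_number, name, value) for each uncommented watched Option."""
    found = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        code = line.split("#", 1)[0]  # everything after '#' is a comment
        m = _OPTION.match(code)
        if m is None:
            continue
        # Xorg ignores case, spaces and underscores in option names.
        key = m.group(1).replace(" ", "").replace("_", "").lower()
        if key in WATCHED:
            found.append((lineno, m.group(1), m.group(2)))
    return found


if __name__ == "__main__":
    # Assumed search locations; adjust for your distribution.
    paths = ["/etc/X11/xorg.conf"]
    paths += sorted(glob.glob("/etc/X11/xorg.conf.d/*.conf"))
    paths += sorted(glob.glob("/usr/share/X11/xorg.conf.d/*.conf"))
    for path in paths:
        try:
            with open(path, encoding="utf-8", errors="replace") as fh:
                hits = active_options(fh.read())
        except OSError:
            continue  # missing or unreadable file
        for lineno, name, value in hits:
            print(f"{path}:{lineno}: Option {name!r} = {value!r}")
```

Commented-out lines are skipped, so only settings the X server will actually read are reported.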

Hmm, I did not know about that 10-nvidia.conf file. I guess it was created by the install procedure.

I now have in 10-nvidia.conf:

Section "OutputClass"
    Identifier "nvidia"
    MatchDriver "nvidia-drm"
    Driver "nvidia"
    Option "AllowEmptyInitialConfiguration"
#    Option "PrimaryGPU" "yes"
#    Option "SLI" "Auto"
    Option "BaseMosaic" "on"
EndSection

Section "OutputClass"
    Identifier "intel"
    MatchDriver "i915"
    Driver "modesetting"
EndSection

and in /etc/X11/xorg.conf:

Section "Device"
    Identifier     "Videocard0"
    Driver         "nvidia"
    BusID "PCI:179:0:0"
EndSection

Section "Screen"
    Identifier     "Default Screen"
    Device         "Videocard0"
    DefaultDepth    24
    SubSection     "Display"
        Depth       24
        Modes      "nvidia-auto-select"
    EndSubSection
EndSection

Unfortunately, I still get:

[ 33620.779] (EE) NVIDIA(GPU-0): Failed to select a display subsystem.
[ 33620.779] (EE) NVIDIA(GPU-0): Only one GPU will be used for this X screen.
[ 33620.779] (EE) NVIDIA(GPU-0): The NVIDIA graphics device PCI:179:0:0 is part of an active
[ 33620.779] (EE) NVIDIA(GPU-0):     SLI configuration and is currently unavailable for single
[ 33620.779] (EE) NVIDIA(GPU-0):     GPU rendering.  Please see Chapter 30: Configuring SLI and
[ 33620.779] (EE) NVIDIA(GPU-0):     Multi-GPU Mosaic in the README for troubleshooting
[ 33620.779] (EE) NVIDIA(GPU-0):     information.
[ 33620.779] (EE)  *** Aborting ***
[ 33620.779] (EE) NVIDIA(0): Failing initialization of X screen
[ 33620.780] (EE) NVIDIA(G0): Only one X screen is supported when Base Mosaic is enabled.
[ 33620.780] (EE) NVIDIA(G0):     Disabling this screen.
[ 33620.780] (EE) NVIDIA(G0): Failing initialization of X screen
[ 33620.780] (EE) NVIDIA(G1): Only one X screen is supported when Base Mosaic is enabled.
[ 33620.780] (EE) NVIDIA(G1):     Disabling this screen.
[ 33620.780] (EE) NVIDIA(G1): Failing initialization of X screen
[ 33620.780] (EE) Screen(s) found, but none have a usable configuration.
[ 33620.780] (EE)
[ 33620.780] (EE) no screens found(EE)
[ 33620.780] (EE)
[ 33620.780] (EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
[ 33620.780] (EE)
[ 33620.784] (EE) Server terminated with error (1). Closing log file.

I tried setting Option SLI "off" and Option BaseMosaic "off", but that did not help.

The next time I am on site, I am just going to remove one of the Quadros.

Removing one of the Quadros made it work with just a simple xorg.conf:

Section "Device"
        Identifier  "Videocard0"
        Driver      "nvidia"
        BusID "PCI:101:0:0"
EndSection

and in 10-nvidia.conf we have:

Section "OutputClass"
    Identifier "nvidia"
    MatchDriver "nvidia-drm"
    Driver "nvidia"
    Option "AllowEmptyInitialConfiguration"
#    Option "PrimaryGPU" "yes"
#    Option "SLI" "Auto"
    Option "SLI" "off"
#    Option "BaseMosaic" "on"
    Option "BaseMosaic" "off"
EndSection

This is fine for this box, as having two Quadros was a purchasing "mistake"; it was only supposed to have one (Xorg runs on the Quadro, ML/AI development on the RTX).