Nvidia-xconfig does not correctly configure multiple GPUs

Hi,

I installed four NVIDIA Quadro RTX 8000 GPUs (device ID 1e30) and followed the main steps needed to enable the NVIDIA driver:

  1. Enabled early KMS for nvidia through GRUB and the initramfs. The resulting kernel command line is:
    BOOT_IMAGE=/boot/vmlinuz-5.4.0-70-generic root=UUID=2669e54b-187c-4de6-a661-0c2eda159296 ro quiet splash nogpumanager nvidia-drm.modeset=1 vt.handoff=1

  2. Suppressed nouveau by blacklisting it in *.conf files

  3. Disabled gpumanager

  4. Disabled Wayland

  5. Booted to the multi-user level, i.e. runlevel 3

  6. Installed nvidia-driver-460 (460.67) using apt, followed by a system reboot
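
For anyone reproducing this, the kernel command line from step 1 can be checked for the expected parameters with a small shell sketch. The helper name has_kernel_param is hypothetical, and the command line is the one quoted in step 1; on a live system the contents of /proc/cmdline could be passed instead.

```shell
# Hypothetical helper: succeed if kernel command line $1 contains
# parameter $2 as a whole, space-delimited word.
has_kernel_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Command line copied from step 1 of this post.
cmdline='BOOT_IMAGE=/boot/vmlinuz-5.4.0-70-generic root=UUID=2669e54b-187c-4de6-a661-0c2eda159296 ro quiet splash nogpumanager nvidia-drm.modeset=1 vt.handoff=1'

# Report each parameter of interest as present or missing.
for param in nvidia-drm.modeset=1 nogpumanager; do
    if has_kernel_param "$cmdline" "$param"; then
        echo "present: $param"
    else
        echo "missing: $param"
    fi
done
```

Once nvidia_drm is loaded, the value that actually took effect should also be readable from /sys/module/nvidia_drm/parameters/modeset.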

After rebooting, the NVIDIA driver is in use in most places but not everywhere (see the report below). I ran nvidia-xconfig to generate an xorg.conf (or 20-nvidia.conf) file, but none of the settings suggested on the NVIDIA developer forum or in the Xorg configuration section of the NVIDIA driver documentation helped. I tried:

  1. Adding BusID to the "Device" section in xorg.conf
  2. Adding "AllowEmptyInitialConfiguration"
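
One detail worth checking with the BusID setting: nvidia-smi prints PCI addresses in hexadecimal, while the BusID option in xorg.conf takes decimal bus, device, and function numbers. Below is a minimal sketch of the conversion, assuming the domain:bus:device.function layout that nvidia-smi prints in the report below. The function name smi_to_xorg_busid is hypothetical.

```shell
# Hypothetical helper: convert an nvidia-smi style bus id (hexadecimal,
# e.g. 00000000:88:00.0) into the decimal form used by BusID in xorg.conf.
# The body runs in a subshell so its variables do not leak to the caller.
smi_to_xorg_busid() (
    id="$1"
    rest="${id#*:}"       # drop the PCI domain      -> 88:00.0
    bus="${rest%%:*}"     # bus number (hex)         -> 88
    devfn="${rest#*:}"    # device.function          -> 00.0
    dev="${devfn%%.*}"    # device number (hex)      -> 00
    fn="${devfn#*.}"      # function number (hex)    -> 0
    printf 'PCI:%d:%d:%d\n' "$((0x$bus))" "$((0x$dev))" "$((0x$fn))"
)
```

For example, `smi_to_xorg_busid 00000000:88:00.0` prints `PCI:136:0:0`. The output of `nvidia-xconfig --query-gpu-info` should list the same IDs in the form Xorg expects, which is a useful cross-check.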

Instead, I got a 72x72 black screen each time I started gdm (systemctl start gdm) with the NVIDIA configuration in place.

I need rendering to go through the NVIDIA driver instead of Mesa, and I also need to enable SLI. How should I proceed?

nvidia-bug-report.log.gz (950.3 KB)

Report:

  • nvidia-smi >

Thu Apr 1 14:01:29 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.67       Driver Version: 460.67       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro RTX 8000     Off  | 00000000:88:00.0 Off |                  Off |
| 33%   23C    P8    12W / 260W |     10MiB / 48601MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Quadro RTX 8000     Off  | 00000000:89:00.0 Off |                  Off |
| 33%   25C    P8    15W / 260W |     10MiB / 48601MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  Quadro RTX 8000     Off  | 00000000:B1:00.0 Off |                  Off |
| 33%   24C    P8     7W / 260W |     10MiB / 48601MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  Quadro RTX 8000     Off  | 00000000:B2:00.0 Off |                  Off |
| 33%   24C    P8    10W / 260W |     10MiB / 48601MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      3049      G   /usr/lib/xorg/Xorg                  4MiB |
|    0   N/A  N/A      3559      G   /usr/lib/xorg/Xorg                  4MiB |
|    1   N/A  N/A      3049      G   /usr/lib/xorg/Xorg                  4MiB |
|    1   N/A  N/A      3559      G   /usr/lib/xorg/Xorg                  4MiB |
|    2   N/A  N/A      3049      G   /usr/lib/xorg/Xorg                  4MiB |
|    2   N/A  N/A      3559      G   /usr/lib/xorg/Xorg                  4MiB |
|    3   N/A  N/A      3049      G   /usr/lib/xorg/Xorg                  4MiB |
|    3   N/A  N/A      3559      G   /usr/lib/xorg/Xorg                  4MiB |
+-----------------------------------------------------------------------------+

  • glxinfo -B >
    name of display: :1
    display: :1 screen: 0
    direct rendering: Yes
    Extended renderer info (GLX_MESA_query_renderer):
    Vendor: VMware, Inc. (0xffffffff)
    Device: llvmpipe (LLVM 10.0.0, 256 bits) (0xffffffff)
    Version: 20.0.8
    Accelerated: no
    Video memory: 193089MB
    Unified memory: no
    Preferred profile: core (0x1)
    Max core profile version: 3.3
    Max compat profile version: 3.1
    Max GLES1 profile version: 1.1
    Max GLES[23] profile version: 3.1
    OpenGL vendor string: VMware, Inc.
    OpenGL renderer string: llvmpipe (LLVM 10.0.0, 256 bits)
    OpenGL core profile version string: 3.3 (Core Profile) Mesa 20.0.8
    OpenGL core profile shading language version string: 3.30
    OpenGL core profile context flags: (none)
    OpenGL core profile profile mask: core profile

    OpenGL version string: 3.1 Mesa 20.0.8
    OpenGL shading language version string: 1.40
    OpenGL context flags: (none)

    OpenGL ES profile version string: OpenGL ES 3.1 Mesa 20.0.8
    OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.10
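
The renderer line is the quickest way to tell which stack is serving OpenGL: llvmpipe is Mesa's software rasterizer, so the output above means the NVIDIA GLX libraries are not being used on display :1. Below is a minimal sketch of that check, reading `glxinfo -B` output on stdin. The function name classify_renderer is hypothetical, and the patterns used to recognize an NVIDIA renderer string are an assumption.

```shell
# Hypothetical helper: read `glxinfo -B` output on stdin, extract the
# OpenGL renderer string, and label it. Runs in a subshell so the
# variable does not leak to the caller.
classify_renderer() (
    renderer="$(sed -n 's/^[[:space:]]*OpenGL renderer string:[[:space:]]*//p' | head -n 1)"
    case "$renderer" in
        "")                  echo "unknown: no renderer string found" ;;
        *llvmpipe*)          echo "software: $renderer" ;;
        *NVIDIA*|*Quadro*)   echo "nvidia: $renderer" ;;   # assumed patterns
        *)                   echo "other: $renderer" ;;
    esac
)
```

Typical use would be `glxinfo -B | classify_renderer`, run inside the X session being tested.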


  • inxi -G > the output is hard to paste here, but the drivers listed are nvidia, nvidia, nvidia, nvidia


  • lshw -c video | grep configuration >

   configuration: driver=mgag200 latency=0
   configuration: driver=nvidia latency=0
   configuration: driver=nvidia latency=0
   configuration: driver=nvidia latency=0
   configuration: driver=nvidia latency=0

  • nvidia-smi -a > the output is very long, but the important part is that Display Mode is Disabled and Display Active is Disabled

  • lsmod | grep nvidia >

nvidia_uvm 983040 0
nvidia_drm 53248 16
nvidia_modeset 1228800 5 nvidia_drm
nvidia 34123776 562 nvidia_uvm,nvidia_modeset
drm_kms_helper 188416 4 mgag200,nvidia_drm
drm 491520 24 drm_kms_helper,drm_vram_helper,mgag200,nvidia_drm,ttm
i2c_nvidia_gpu 16384 0

  • prime-select query >
nvidia