nvidia-xconfig doesn't do what I want it to, nor does nvidia-settings

Hi,

I installed four NVIDIA Quadro RTX 8000 GPUs (PCI device ID 1e30) and followed all the major steps needed to enable the NVIDIA and CUDA drivers, including the following:

- Enable nvidia in early KMS with GRUB and initramfs
BOOT_IMAGE=/boot/vmlinuz-5.4.0-70-generic root=UUID=2669e54b-187c-4de6-a661-0c2eda159296 ro quiet splash nogpumanager nvidia-drm.modeset=1 vt.handoff=1

- Suppress nouveau by blacklisting in *.conf files

- Disable gpumanager

- Disable Wayland

- Run in multi-user level, i.e. runlevel 3

- Install cuda:

apt install linux-headers-$(uname -r)
wget https://developer.download.nvidia.com/compute/cuda/11.2.2/local_installers/cuda_11.2.2_460.32.03_linux.run
chmod +x <runfile>
sudo sh cuda_11.2.2_460.32.03_linux.run

Add to ~/.bashrc:
export PATH=/usr/local/cuda-11.2/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.2/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
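Several of the steps listed above can be spot-checked from a shell after rebooting. The following is only a minimal sketch, assuming a POSIX shell; `has_kernel_param` is a hypothetical helper name and not part of any NVIDIA tool. It tests whether a kernel command line contains a given parameter, and the commented lines show how it might be combined with standard commands on a live system.

```shell
# Succeed if the kernel command line passed as $2 contains the
# parameter passed as $1 as a whole, space-delimited word.
has_kernel_param() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Possible usage on a running system (not executed here):
#   has_kernel_param nvidia-drm.modeset=1 "$(cat /proc/cmdline)" && echo "modeset enabled"
#   lsmod | grep -q '^nouveau ' && echo "nouveau is still loaded"
#   systemctl get-default    # multi-user.target corresponds to runlevel 3
```

Matching on whole words avoids a false positive when one parameter is a prefix of another.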

After rebooting, I find that the NVIDIA drivers are in use in most places but not everywhere. I tried running nvidia-xconfig to generate an xorg.conf (or 20-nvidia.conf) file, but none of the settings suggested on the NVIDIA developer forum or in the Xorg config section of the NVIDIA driver documentation helped me. These include (see the attached xorg.conf file):

xorg.conf.txt (2.4 KB)

- Adding BusID to the “Device” section in xorg.conf
- Adding “AllowEmptyInitialConfiguration”

Instead, I got a 72x72 black screen each time I started gdm (systemctl start gdm) with the nvidia config.
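A common pitfall with the BusID setting is the number base: tools such as nvidia-smi and lspci print PCI addresses in hexadecimal, while the BusID value in xorg.conf is written in decimal as PCI:bus:device:function. The sketch below assumes the address has the domain:bus:device.function form that `nvidia-smi --query-gpu=pci.bus_id --format=csv,noheader` prints; `pci_to_xorg_busid` is a hypothetical helper name and the example address is made up for illustration.

```shell
# Convert a hexadecimal PCI address of the form domain:bus:device.function
# (e.g. 00000000:3B:00.0) into the decimal form used for BusID in xorg.conf.
# The function body runs in a subshell so its variables do not leak.
pci_to_xorg_busid() (
  rest=${1#*:}        # drop the PCI domain, leaving bus:device.function
  bus=${rest%%:*}     # hexadecimal bus number
  devfn=${rest#*:}    # device.function
  dev=${devfn%.*}     # hexadecimal device number
  fn=${devfn#*.}      # function number
  printf 'PCI:%d:%d:%d\n' "0x$bus" "0x$dev" "0x$fn"
)

# Possible usage (not executed here):
#   pci_to_xorg_busid "$(nvidia-smi --query-gpu=pci.bus_id --format=csv,noheader | head -n 1)"
```

The PCI domain is dropped in this sketch, so it is only suitable for systems where all GPUs are in domain 0.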

nvidia-bug-report.log.gz (952.0 KB)

I need to enable nvidia drivers instead of mesa, and also enable SLI. How should I proceed?

There’s no monitor connected to the Quadros, only one on the server graphics. What do you want to use the Quadros for?

To enable SLI for VRWorks, and to enable multi-GPU computation for video processing. The monitor is connected to the server by a VGA cable. Do I need to unplug the VGA cable if I plug a DisplayPort/HDMI cable into the Quadro?

OK, first of all, the SLI modes in xorg.conf have been removed; only SLIMosaic (for video walls) is left. That doesn’t matter, though, since those are not the robots you’re looking for. VRWorks uses the newer explicit multi-GPU rendering, which doesn’t need to be explicitly configured. If your board supports it, the Quadros should form an SLI group automatically.
To get your config working, you should remove the xorg.conf, disable the server graphics in the BIOS, and then connect a monitor to the primary NVIDIA GPU, i.e. connect the monitor to the NVIDIA cards one after another until you see the login screen.
For config, use a minimal one like
/etc/X11/xorg.conf.d/10-nvidia.conf

Section "OutputClass"
  Identifier "nvidia"
  MatchDriver "nvidia-drm"
  Driver "nvidia"
  Option "AllowEmptyInitialConfiguration" "true"
EndSection

Thanks for the update on SLI. I have been running into the same problem as described here

cudaErrorUnknown / cudaGraphicsGLRegisterBuffer - #6 by th6mas.

After reading the replies, I have a few things to check with you:

  1. In order for CUDA (and CUDA-based apps) to work with OpenGL applications, the OpenGL context must be created by the NVIDIA driver. Is that correct?

  2. I am using my Quadros for compute only, not for rendering (they are not plugged into my monitor). A non-NVIDIA graphics card does the rendering (Xorg with modesetting, nvidia). The monitor is connected to the internal graphics card with a VGA cable. So, for the OpenGL context to be created with nvidia, do I need to unplug the VGA cable, plug my monitor into a Quadro (with HDMI or DP) as you described in the previous reply, and then start Xorg with nvidia-accelerated graphics?

  • In other words, is it incorrect to use the Quadros for running CUDA-based applications (that use GL) while the rendering/display is performed by a non-NVIDIA graphics card? I believe the GL context is then created by the non-NVIDIA driver, which cannot run the CUDA samples (e.g. 2_Graphics/simpleGL), and that for them to run the Quadros must perform BOTH computation and graphics. Please correct me if I am wrong here.

  3. The NVIDIA runfile installation conflicts with a package manager installation. But would it also conflict with Mesa installed through the package manager? Some packages may install Mesa drivers (GL, etc.). Would those conflict with the NVIDIA GL libraries in /usr/lib/x86_64-linux-gnu/ ?

Correct, direct GL/CUDA interop requires the rendering to happen on the NVIDIA GPU, i.e. that GPU must be running an X server. It is possible to use VirtualGL to keep the desktop running on the server graphics, but this adds a lot of complexity.
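One quick way to see which driver is providing the GL context in a running X session is to look at the vendor string reported by glxinfo (on Ubuntu it is part of the mesa-utils package). The sketch below is illustrative only; `gl_vendor` and `is_nvidia_gl` are hypothetical helper names that read glxinfo output from standard input.

```shell
# Print the OpenGL vendor string found in glxinfo output read from stdin.
gl_vendor() {
  sed -n 's/^OpenGL vendor string: //p'
}

# Succeed if the glxinfo output read from stdin reports an NVIDIA GL vendor.
is_nvidia_gl() {
  case "$(gl_vendor)" in
    *NVIDIA*) return 0 ;;
    *) return 1 ;;
  esac
}

# Possible usage inside the X session (not executed here):
#   glxinfo | is_nvidia_gl && echo "GL context is provided by the NVIDIA driver"
```

If the vendor string names Mesa or another vendor, samples such as 2_Graphics/simpleGL can be expected to fail in the way described above.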

On any fairly recent Linux distribution that uses libglvnd for GLX switching, the runfile installer no longer conflicts with Mesa; the same applies to the package manager in general. It’s just important that no NVIDIA packages have been installed through the package manager before using the runfile installer.
On hybrid graphics notebooks this doesn’t apply, but only because those need additional config provided by the distribution’s repositories.
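Before running the runfile installer, it may help to confirm that no NVIDIA packages are installed through the package manager. The sketch below assumes a Debian/Ubuntu system and the usual `dpkg -l` column layout (status, then package name); `installed_nvidia_packages` is a hypothetical helper name that filters such a listing read from standard input.

```shell
# Print the names of fully installed packages (status "ii") whose name
# contains "nvidia", given `dpkg -l` output on stdin.
installed_nvidia_packages() {
  awk '$1 == "ii" && $2 ~ /nvidia/ { print $2 }'
}

# Possible usage (not executed here):
#   dpkg -l | installed_nvidia_packages
#   ldconfig -p | grep 'libGLX_'   # with libglvnd, vendor libraries such as
#                                  # libGLX_nvidia and libGLX_mesa can coexist
```

An empty result from the first command suggests the runfile installer will not collide with packaged NVIDIA drivers.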
