nvidia-xconfig doesn't do what I want it to, nor does nvidia-settings

akshitij · April 5, 2021, 10:33pm

Hi,

I installed four NVIDIA GPUs (1e30) Quadro RTX 8000 and followed all the major steps needed to enable nvidia and cuda drivers which include the following:

- Enable nvidia in early KMS with GRUB and initramfs
BOOT_IMAGE=/boot/vmlinuz-5.4.0-70-generic root=UUID=2669e54b-187c-4de6-a661-0c2eda159296 ro quiet splash nogpumanager nvidia-drm.modeset=1 vt.handoff=1

- Suppress nouveau by blacklisting in *.conf files

- Disable gpumanager

- Disable Wayland

- Run in multi-user level, i.e. runlevel 3

- Install cuda:

apt install linux-headers-$(uname -r)
wget https://developer.download.nvidia.com/compute/cuda/11.2.2/local_installers/cuda_11.2.2_460.32.03_linux.run
chmod +x cuda_11.2.2_460.32.03_linux.run
sudo sh cuda_11.2.2_460.32.03_linux.run
Add to ~/.bashrc:
- export PATH=/usr/local/cuda-11.2/bin${PATH:+:${PATH}}
- export LD_LIBRARY_PATH=/usr/local/cuda-11.2/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
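
The boot-time steps above can be sanity-checked from a shell. The following is only a minimal sketch: the helper names `has_kernel_param` and `nouveau_loaded` are made up for illustration, and they only inspect text that is handed to them (the kernel command line and the `lsmod` listing).

```shell
#!/bin/sh
# Hypothetical helpers (not part of any NVIDIA tool).

# has_kernel_param CMDLINE PARAM
# Succeeds if PARAM appears as a whole word in the kernel command line.
has_kernel_param() {
    for word in $1; do
        [ "$word" = "$2" ] && return 0
    done
    return 1
}

# nouveau_loaded LSMOD_OUTPUT
# Succeeds if a module named exactly "nouveau" is listed.
nouveau_loaded() {
    printf '%s\n' "$1" | awk '$1 == "nouveau" { found = 1 } END { exit !found }'
}

# Usage on a live system:
#   has_kernel_param "$(cat /proc/cmdline)" nvidia-drm.modeset=1 && echo "modeset flag present"
#   nouveau_loaded "$(lsmod)" && echo "nouveau is still loaded"
```

If the modeset flag is missing or nouveau is still listed, the GRUB or blacklist step probably did not take effect (for example because the initramfs was not regenerated).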

After rebooting, I find that the NVIDIA drivers are in use in most places but not everywhere. I tried running nvidia-xconfig to generate an xorg.conf (or 20-nvidia.conf) file, but none of the settings suggested on the NVIDIA developer forum or in the Xorg config section of the NVIDIA driver documentation helped. These include (see attached xorg.conf file):

xorg.conf.txt (2.4 KB)

Adding BusID to the “Device” section in xorg.conf
Adding “AllowEmptyInitialConfiguration”

Instead, I got a 72x72 black screen each time I started gdm (systemctl start gdm) with the nvidia config.

nvidia-bug-report.log.gz (952.0 KB)

I need to enable nvidia drivers instead of mesa, and also enable SLI. How should I proceed?

generix · April 6, 2021, 7:01am

There’s no monitor connected to the Quadros, only one on the server graphics, what do you want to use the quadros for?

akshitij · April 6, 2021, 3:40pm

To enable SLI for VRWorks, and to enable multi-GPU computation for video processing. The monitor is connected to the server by a VGA cable. Do I need to unplug the VGA cable if I plug a DisplayPort/HDMI cable into the Quadro?

generix · April 6, 2021, 4:16pm

Ok, first of all, the SLI modes in xorg.conf are removed, only SLIMosaic (for videowalls) is left. Doesn’t matter though since those are not the robots you’re looking for. VRWorks uses the newer explicit multi-gpu rendering which doesn’t need to be explicitly configured. If your board supports it, the quadros should form an sli-group automatically.
To get your config working, you should remove the xorg.conf, disable the server graphics in bios and then connect a monitor to the primary nvidia gpu, i.e. connect the monitor to the nvidia cards one after another until you see the login-screen.
For config, use a minimal one like
/etc/X11/xorg.conf.d/10-nvidia.conf

Section "OutputClass"
  Identifier "nvidia"
  MatchDriver "nvidia-drm"
  Driver "nvidia"
  Option "AllowEmptyInitialConfiguration" "true"
EndSection
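
To see whether a session started with a config like this actually ended up on the NVIDIA GL implementation rather than Mesa, the vendor string reported by `glxinfo -B` (from the mesa-utils package on Ubuntu) can be inspected. This is a rough sketch; `gl_vendor` and `is_nvidia_gl` are hypothetical helper names, and they only parse whatever text is piped in.

```shell
#!/bin/sh
# Hypothetical helpers for reading glxinfo output from stdin.

# gl_vendor: print the value of the "OpenGL vendor string:" line.
gl_vendor() {
    sed -n 's/^OpenGL vendor string: *//p'
}

# is_nvidia_gl: succeed if that vendor string mentions NVIDIA.
is_nvidia_gl() {
    gl_vendor | grep -q 'NVIDIA'
}

# Usage inside the X session:
#   glxinfo -B | gl_vendor
#   glxinfo -B | is_nvidia_gl && echo "GL is provided by the NVIDIA driver"
```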

akshitij · April 9, 2021, 6:40pm

Thanks for the update on SLI. I have been running into the same problem as described here

cudaErrorUnknown / cudaGraphicsGLRegisterBuffer - #6 by th6mas.

After reading the replies, I have a few things to check with you:

In order for CUDA (and CUDA-based apps) to run OpenGL applications, the OpenGL context must be created with NVIDIA. Is that correct?
I am using my Quadros for compute only, not for rendering (they are not connected to my monitor). I have a non-NVIDIA graphics card doing the rendering: Xorg with modesetting, nvidia. The monitor is connected to the internal graphics card with a VGA cable. So for the OpenGL context to be created with nvidia, do I need to unplug the VGA cable and plug my monitor into a Quadro (with HDMI or DP), as you described in the previous reply, and then start Xorg with nvidia accelerated graphics?

In other words, is it incorrect to use the Quadros for running CUDA-based applications (that use GL) while the rendering/display is being performed by a non-NVIDIA graphics card? I believe the GL context is then created with a non-NVIDIA context that cannot run CUDA samples (e.g. 2_Graphics/simpleGL), and for those to run the Quadros must perform BOTH computation and graphics. Please correct me if I am wrong here.

Nvidia runfile installation conflicts with package manager installation. But would that also conflict with mesa package manager installation? Some of the packages may install mesa drivers (GL, etc). Would that conflict with nvidia GL libraries in /usr/lib/x86_64-linux-gnu/ ?

generix · April 9, 2021, 7:33pm

Correct, direct gl/cuda interop requires rendering happening on the nvidia gpu, i.e. it running an xserver. It is possible to use virtualgl to have the desktop running on the server graphics, but this adds a lot of complexity.
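
One way to see which GPUs currently have a display initialized on them (i.e. are driven by an X server) is the `display_active` field of `nvidia-smi`. The sketch below assumes the CSV layout produced by `--format=csv,noheader` with fields separated by a comma and a space; `gpus_with_display` is a hypothetical helper, and the available field names are best confirmed with `nvidia-smi --help-query-gpu`.

```shell
#!/bin/sh
# Hypothetical helper: read "index, name, display_active" CSV lines on stdin
# and print the index of every GPU whose display_active value is "Enabled".
gpus_with_display() {
    awk -F', ' '$3 == "Enabled" { print $1 }'
}

# Usage on a live system:
#   nvidia-smi --query-gpu=index,name,display_active --format=csv,noheader | gpus_with_display
```

If this prints nothing, no X server is rendering on any of the NVIDIA GPUs, which would fit GL/CUDA interop samples failing.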

On any fairly recent linux distribution which is using libglvnd for glx switching, the runfile installer isn’t conflicting with mesa anymore, also applies to the package manager in general. It’s just important that no nvidia packages have been installed through the package manager before using the runfile installer.
On hybrid graphics notebooks this doesn’t apply, but only because those need additional config provided by the distribution’s repositories.
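
Before running the runfile installer, it may help to list any nvidia packages that are already installed through the package manager. A minimal sketch, assuming a Debian/Ubuntu system where `dpkg -l` is available; `installed_nvidia_pkgs` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `dpkg -l` output on stdin and print the names of
# fully installed packages (status "ii") whose name contains "nvidia".
installed_nvidia_pkgs() {
    awk '$1 == "ii" && $2 ~ /nvidia/ { print $2 }'
}

# Usage on a live system:
#   dpkg -l | installed_nvidia_pkgs
```

An empty result suggests no packaged nvidia driver is installed that could clash with the runfile.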

spydronexbow · May 8, 2021, 7:10pm

Dear Generix,

First of all, thank you for your help with setting up my Lenovo Legion 5 15ARH05, with an AMD Ryzen 4800H APU and an NVidia GTX1660Ti.
I am using Arch Linux (kernel 5.12.1-arch1-1) with SDDM and KDE Plasma under X11.
As the last part of your answer was written for GDM, I had to look around for my setup.
I’ve found the solutions for SDDM and LightDM, which I’d like to share:

LightDM: NVIDIA Optimus - ArchWiki
SDDM: NVIDIA Optimus - ArchWiki

I hope this helps people that found this very useful thread.

hellyerc74 · May 18, 2021, 4:39pm

Hello nvidia,

Have been fighting with this for a few days, but it seems like every time I solve one problem, I encounter a different one. Initially, my two ViewSonic monitors were working, attached with Prudent Way PWI-USB-HDV adapters, but there was no sound from my HDMI connected ASUS monitor, and the video was choppy on all three. After following a suggestion in another topic on the forum, I downgraded the driver package from 460 to 390 (the closest one to being below 410) to resolve the sound issue, which worked. It was stated there that there are problems with HDMI and sound above the 410 driver.

To solve the choppy video, I changed the settings to “maximum performance,” which fixed it on the main HDMI ASUS screen after the restart; however, the two ViewSonic screens are no longer detected. It also appears that the settings which solved the choppy video didn’t survive the restart, even though I saved the changes to the configuration file, and the other two monitors still aren’t detected.

I attached my bug report log for your inspection. Any help you could give me to solve this would be greatly appreciated!

Thanks!

nvidia-bug-report.log.gz (213.1 KB)

hellyerc74 · May 18, 2021, 4:43pm

The previously attached bug report was prior to the most recent restart. Attached is the one after the most recent restart.

nvidia-bug-report.log.gz (236.1 KB)

jlacarneiro · July 14, 2021, 9:56pm

Hello,

I have a Lenovo Legion Y520 that worked fine with Windows 10 AND Ubuntu 20.04 (on dual boot) until last week.
Although I basically use Ubuntu, from time to time I use Windows to keep it updated. Unfortunately, last week’s update updated the BIOS and messed up my Linux config. I think the BIOS update may have re-enabled secure boot. But before I came to this conclusion, I tried uninstalling drivers and such and must have messed up my Linux config…
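
Whether secure boot is currently on can be checked from Linux with `mokutil --sb-state`, which on Ubuntu prints a line starting with "SecureBoot enabled" or "SecureBoot disabled". The snippet below is only a sketch built on that assumption; `secure_boot_enabled` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `mokutil --sb-state` output on stdin and succeed
# if it reports that secure boot is enabled.
secure_boot_enabled() {
    grep -q '^SecureBoot enabled'
}

# Usage on a live system:
#   mokutil --sb-state | secure_boot_enabled && echo "secure boot is on"
```

With secure boot on, unsigned kernel modules are typically refused, which would be consistent with the nvidia module no longer loading after the BIOS update.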

Nvidia still works fine under W10, but under Ubuntu I can only use Intel (IGPU?). I’ve been struggling for a few days now. I even accessed another thread in this forum (Version 460 nvidia driver stopped loading overnight under ubuntu 20.04 - #12 by jlacarneiro) to no avail.

Since it was taking too long, and various sites mentioned that nVidia worked fine under Ubuntu 21.04 using Wayland, I upgraded my Linux box to 20.10 and then to 21.04. But it still doesn’t work. If I tweak it too much, the notebook simply hangs after boot.

Can you help me?

nvidia-bug-report.log.gz (450.8 KB)