Not sure if Quadro P2000 card is installed and working correctly on Linux system

My system is Red Hat Enterprise Linux 7.6 (kernel 3.10.0-957.5.1.el7.x86_64), with an NVIDIA Quadro P2000 5GB 4 DP (7X20T) card. The system is from Dell, based on a 7920 workstation, and Dell built it. I am seeing some strange graphics behavior in the applications I use that rely on OpenGL: under some graphics operations the graphical window goes black. I started to look at the installation, but I am wondering whether the card is actually installed and running correctly.

When I look at the NVIDIA X Server Settings utility, it says “you do not appear to be using the NVIDIA X driver. Please edit your X configuration file (run nvidia-xconfig as root)…”. If I run nvidia-xconfig as root, the script runs fine, creating the file /usr/share/X11/xorg.conf. But then the system does not come back up properly: when I try to reboot the workstation it goes all the way to a grey screen, but I cannot sign on. If I boot in single user mode and delete the recently created xorg.conf file, everything boots as usual with no issues. So I cannot use the NVIDIA X driver for some reason?

I reviewed the file /var/log/nvidia-installer.log. It shows that the NVIDIA X driver was not configured for some reason. At the end of the log file it says:

done.
→ Driver file installation is complete.
→ Installing DKMS kernel module:
→ done.
→ Running post-install sanity check:
→ done.
→ Post-install sanity check passed.
→ Running runtime sanity check:
→ done.
→ Runtime sanity check passed.
→ Would you like to run the nvidia-xconfig utility to automatically update your X configuration file so that the NVIDIA X driver will be used when you restart X? Any pre-existing X configuration file will be backed up. (Answer: No)
→ Installation of the NVIDIA Accelerated Graphics Driver for Linux-x86_64 (version: 390.48) is now complete. Please update your XF86Config or xorg.conf file as appropriate; see the file /usr/share/doc/NVIDIA_GLX-1.0/README.txt for details.

When I do a “xdpyinfo” command, GLX shows up as one of the extensions, but “NV-GLX” does not.

When I open /var/log/Xorg.0.log and search for “EE”, I find this ominous block of text:

[ 102.597] Module class: X.Org Video Driver
[ 102.597] ================ WARNING WARNING WARNING WARNING ================
[ 102.597] This server has a video driver ABI version of 24.0 that this
driver does not officially support. Please check
http://www.nvidia.com/ for driver updates or downgrade to an X
server with a supported driver ABI.
[ 102.597] =================================================================
[ 102.597] (EE) NVIDIA: Use the -ignoreABI option to override this check.
[ 102.597] (II) UnloadModule: "nvidia"
[ 102.597] (II) Unloading nvidia
[ 102.597] (EE) Failed to load module "nvidia" (unknown error, 0)
[ 102.597] (II) LoadModule: "nouveau"
[ 102.597] (II) Loading /usr/lib64/xorg/modules/drivers/nouveau_drv.so
[ 102.599] (II) Module nouveau: vendor="X.Org Foundation"
[ 102.599] compiled for 1.20.0, module version = 1.0.15
[ 102.599] Module class: X.Org Video Driver
[ 102.599] ABI class: X.Org Video Driver, version 24.0
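
For anyone repeating this check, the search through Xorg.0.log can be scripted. The following is a minimal sketch, assuming the log lines look like the excerpt above (a bracketed timestamp followed by a marker such as (EE) or (II)). The function name summarize_xorg_log and the returned keys are made up for illustration and are not part of any NVIDIA or X.Org tool.

```python
import re

# A line counts as an error only when (EE) directly follows the timestamp.
# This skips the legend near the top of the log, which also mentions "(EE)".
EE_LINE = re.compile(r"^\[\s*[\d.]+\]\s+\(EE\)\s*(.*)$")

# Matches the ABI warning text quoted in the log excerpt above.
ABI_WARNING = re.compile(r"video driver ABI version of ([\d.]+)")

# Accepts straight or curly quotes, since forum software may convert them.
FAILED_MODULE = re.compile(
    r"Failed to load module [\"\u201c]([^\"\u201d]+)[\"\u201d]"
)


def summarize_xorg_log(text):
    """Collect (EE) messages, the unsupported ABI version, and failed modules."""
    errors = []
    for line in text.splitlines():
        match = EE_LINE.match(line)
        if match:
            errors.append(match.group(1))
    abi = ABI_WARNING.search(text)
    return {
        "errors": errors,
        "unsupported_abi": abi.group(1) if abi else None,
        "failed_modules": FAILED_MODULE.findall(text),
    }
```

To run it against a real log, read the file and pass its contents in, for example summarize_xorg_log(open("/var/log/Xorg.0.log", errors="replace").read()). A non-empty failed_modules list containing nvidia, together with an unsupported_abi value, points to the same conclusion as the excerpt: the X server rejected the NVIDIA module and fell back to nouveau.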

When I run nvidia-smi I get this output:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.48                 Driver Version: 390.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro P2000        Off  | 00000000:73:00.0 Off |                  N/A |
| 44%   30C    P8     7W /  75W |      0MiB /  5050MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

When I do a “glxinfo” command there is no mention of nvidia anywhere.
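
The glxinfo check can be made repeatable by pulling out the OpenGL vendor and renderer strings rather than scanning the whole output by eye. This is a minimal sketch, assuming the output contains the usual "OpenGL vendor string:" and "OpenGL renderer string:" lines. The function names are hypothetical, and the sample strings in the comments are illustrative, not taken from this system.

```python
import re

# glxinfo prints lines such as:
#   OpenGL vendor string: NVIDIA Corporation
#   OpenGL renderer string: <name of the GPU or software renderer>
STRING_LINE = re.compile(r"OpenGL (vendor|renderer) string:\s*(.*)")


def opengl_vendor_and_renderer(glxinfo_text):
    """Return (vendor, renderer) from glxinfo output, or None where missing."""
    fields = {}
    for line in glxinfo_text.splitlines():
        match = STRING_LINE.match(line.strip())
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return fields.get("vendor"), fields.get("renderer")


def uses_nvidia_driver(glxinfo_text):
    """True only when the vendor string names NVIDIA's proprietary driver."""
    vendor, _renderer = opengl_vendor_and_renderer(glxinfo_text)
    return vendor is not None and vendor.startswith("NVIDIA")
```

The text to pass in is whatever glxinfo prints, for example captured with glxinfo > glxinfo.txt and read back from that file. With the open-source nouveau driver or a software renderer, the vendor string names something other than NVIDIA, so uses_nvidia_driver returns False, which matches the symptom described here.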

I have the nvidia bug report output, but it is massive.

Help please, and apologies that I do not know where to start!

nvidia-bug-report.log.gz (131 KB)

Please run nvidia-bug-report.sh as root and attach the resulting .gz file to your post. Hovering the mouse over an existing post of yours will reveal a paperclip icon.
[url]https://devtalk.nvidia.com/default/topic/1043347/announcements/attaching-files-to-forum-topics-posts/[/url]

Attached. Thanks for the help!

The installed driver is too old so it isn’t even used, please use this one:
[url]https://devtalk.nvidia.com/default/topic/1047710[/url]
Stop the Xserver, use --dkms option to install. No need to create an xorg.conf.

Hi,

So the setup is hugely wrong. I have never updated a graphics driver on Linux before. I have downloaded the file NVIDIA-Linux-x86_64-418.43.run. I will start to read through the README file, but it looks pretty massive. I have installed packages using yum and rpm files before, and have a basic understanding of Linux, but I am no IT expert. What are the essential steps / sections in the README, and can a person like me be expected to do this successfully?

PS: I have already a recent clonezilla system image, but I hate to have to go back to that after botching the nvidia driver installation…

cat /proc/version: 3.10.0-957.5.1.el7.x86_64 OK
Xorg -version: 1.20.1 OK
insmod --version: kmod version 20 OK??
ls /lib/libc.so.*: /lib/libc.so.6 OK
pkg-config --modversion vdpau: Package vdpau was not found in the pkg-config search path.

From a yum search vdpau I get this output:

Loaded plugins: langpacks, product-id, search-disabled-repos, subscription-
: manager
============================== N/S matched: vdpau ==============================
libva-vdpau-driver.x86_64 : HW video decode support for VDPAU platforms
libvdpau-devel.i686 : Development files for libvdpau
libvdpau-devel.x86_64 : Development files for libvdpau
libvdpau-docs.noarch : Documentation for libvdpau
libvdpau-va-gl.x86_64 : VDPAU driver with OpenGL/VAAPI back-end
vdpauinfo.x86_64 : Tool to query the capabilities of a VDPAU implementation
libvdpau.i686 : Wrapper library for the Video Decode and Presentation API
libvdpau.x86_64 : Wrapper library for the Video Decode and Presentation API
mesa-vdpau-drivers.x86_64 : Mesa-based DRI drivers

Which should I install, libva-vdpau-driver.x86_64 and/or libva-vdpau-driver.x86_64 and/or libvdpau-va-gl.x86_64?

I don’t know why you’re now asking about vdpau. Which package should you install? None.
Back to the driver:
The driver on your system has previously been installed using the .run installer. If you don’t have any experience with it, it might be better if you switch to a packaged version e.g.
elrepo [url]http://elrepo.org/tiki/tiki-index.php[/url]
or negativo17 [url]https://negativo17.org/nvidia-driver/[/url]
or rpmfusion [url]https://rpmfusion.org/Howto/NVIDIA[/url]
It’s important to get rid of the old driver first by running the .run installer using the --uninstall option
sudo sh NVIDIA.run --uninstall

Hi,

At the start of chapter 2 of the NVIDIA README for the driver, vdpau is listed as a minimum requirement… that is why I was asking. If it will work just the same, I am happy to use one of the premade installations you suggest. Which of the 3 do you recommend as best?

Ok, I understand. That is a bit misleading, though. libvdpau would be needed if an application is vdpau capable but this would be installed as a dependency of that application. So you don’t really have to think about it.
The best repository is the one that you already have added to your system. Listing your repos:
[url]https://www.cyberciti.biz/faq/centos-fedora-redhat-yum-repolist-command-tolist-package-repositories/[/url]
If none is added already, maybe go with rpmfusion nonfree; I think it’s the most widely used one:
[url]https://rpmfusion.org/Configuration[/url]

Hi,

As I did not install the original (outdated) driver, how can I find out the version, and where can I get the file needed to uninstall it? Or can I use the latest NVIDIA-Linux-x86_64-418.43.run that I downloaded yesterday to perform the uninstall?

sudo sh NVIDIA.run --uninstall

You can use any recent .run package to uninstall, so just take the one already downloaded.

According to /var/log/nvidia-installer.log, the driver installed on the system is version 390.48 (just one year old). So this is too old for the current card?

I will uninstall with “NVIDIA-Linux-x86_64-390.48.run --uninstall” to try to remove the old driver, then use a pre-packaged install.

Looking at Elrepo, it seems to list kmod-nvidia-346.35 as the latest they have (a lot older than 390.48?).

I used nvidia-detect, and it indicates:

Probing for supported NVIDIA devices…
[10de:1c30] NVIDIA Corporation GP106GL [Quadro P2000]
This device requires the current 418.43 NVIDIA driver kmod-nvidia

So I need the very latest driver, it seems?
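
The comparison between the installed driver and the one nvidia-detect asks for can be written down as a small check. This is a minimal sketch, assuming the two line formats quoted in this thread: the "(version: 390.48)" text from /var/log/nvidia-installer.log and the "requires the current 418.43 NVIDIA driver" line from nvidia-detect. All function names are hypothetical, and other versions of these tools may word their output differently.

```python
import re


def version_tuple(version):
    """Turn '390.48' into (390, 48) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def installed_version(installer_log_text):
    """Return the last '(version: X.Y)' found in the installer log, or None."""
    found = re.findall(r"\(version: ([\d.]+)\)", installer_log_text)
    return found[-1] if found else None


def required_version(nvidia_detect_text):
    """Return the driver version named in nvidia-detect output, or None."""
    match = re.search(
        r"requires the current ([\d.]+) NVIDIA driver", nvidia_detect_text
    )
    return match.group(1) if match else None


def needs_upgrade(installed, required):
    """True when the installed driver is older than the required one."""
    return version_tuple(installed) < version_tuple(required)
```

Comparing tuples of integers rather than strings avoids mistakes such as treating 390.116 as older than 390.48. For the values in this thread, needs_upgrade("390.48", "418.43") is True.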

According to:
[url]https://centos.pkgs.org/7/elrepo-x86_64/kmod-nvidia-418.43-1.el7_6.elrepo.x86_64.rpm.html[/url]
latest elrepo package is 418.43, which is the latest available driver released by nvidia. With your card, you should always use the latest.

Turns out it was super easy to install the latest driver. In the GNOME UI, the latest driver was listed under Applications/System Tools/Software. I searched for nvidia and found “NVIDIA OpenGL X11 display driver files”, which offered the latest driver, “nvidia-x11-drv-418.43-1.el7_6.elrepo”. I clicked “update”, then rebooted, and voilà, all is working. glxinfo reports NVIDIA, /var/log/Xorg.0.log shows the NVIDIA driver in use with no errors, and nvidia-smi reports the driver in use.

And my previous graphics artifacts / black screens are gone!

I was earlier confused when I went to elrepo.org and it does not show new drivers. Poking around there again I still cannot find them. But I see you found them in centos.pkgs.org, though.

The software management system under GNOME on my RHEL system found the latest nvidia driver and it all just worked with one click (it installed the nvidia driver from elrepo, plus dependencies from rhel-7-workstation-rpms including vdpau, vulkan, glvnd-opengl etc). Phew!

Thanks for all of the help and advice.

Yikes! OK, my Nvidia graphics card is now running properly, it seems. But in the process of updating the nvidia driver with nvidia-x11-drv-418.43-1.el7_6.elrepo, I HAVE NOW LOST CPUs 4 - 95 of my system (only CPUs 0 - 3 are online).

What might have happened? Does the Nvidia driver install update some setting on the CPUs?

Help please!

Installing a graphics driver shouldn’t have that kind of impact.
Please create and attach a new nvidia-bug-report.log

I rebooted a second time and all cores are there. No idea what happened, but seems to have resolved itself.
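
If the missing cores ever come back as a problem, one way to see exactly which CPUs are offline is to compare the kernel's "present" and "online" CPU lists. This is a minimal sketch, assuming the standard sysfs files /sys/devices/system/cpu/present and /sys/devices/system/cpu/online, which hold range lists such as 0-95 or 0-3. The function names are made up for illustration.

```python
def parse_cpu_list(spec):
    """Expand a kernel CPU list such as '0-3,8,10-11' into a list of ints."""
    cpus = []
    for part in spec.strip().split(","):
        if not part:
            continue  # an empty file or stray comma yields no CPUs
        if "-" in part:
            low, high = part.split("-")
            cpus.extend(range(int(low), int(high) + 1))
        else:
            cpus.append(int(part))
    return cpus


def offline_cpus(present_spec, online_spec):
    """Return the CPUs that are present in the machine but not online."""
    online = set(parse_cpu_list(online_spec))
    return [cpu for cpu in parse_cpu_list(present_spec) if cpu not in online]
```

The two arguments are the contents of the present and online files, read as text. In the situation described above, a present list of 0-95 and an online list of 0-3 would report CPUs 4 through 95 as offline, and an empty result after the second reboot would confirm that all cores are back.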