CentOS 7 headless with nVidia drivers installed, OpenGL not using nVidia drivers, only llvmpipe

Hello! This is similar to a previous ticket I filed, though in this case I’m not dealing with many nvidia GPUs.
The goal is to be using nVidia OpenGL via TurboVNC Server, so that applications can have access to the latest OpenGL, accelerated on the nVidia A30.

I have a system with one nVidia GPU, and an integrated Matrox display that I have disabled in the EFI config though it still shows up in lspci|grep -i vga.

I have installed the nVidia drivers via the .run file, and can run TurboVNC server.
However, when I run glxinfo, I get:

name of display: :1.0
display: :1 screen: 0
direct rendering: Yes
server glx vendor string: SGI
server glx version string: 1.4
server glx extensions:
GLX_ARB_context_flush_control, GLX_ARB_create_context,
GLX_ARB_create_context_no_error, GLX_ARB_create_context_profile,
GLX_ARB_fbconfig_float, GLX_ARB_framebuffer_sRGB, GLX_ARB_multisample,
GLX_EXT_create_context_es2_profile, GLX_EXT_create_context_es_profile,

Running glxinfo|grep OpenGL gives:
OpenGL vendor string: VMware, Inc.
OpenGL renderer string: llvmpipe (LLVM 7.0, 256 bits)
OpenGL version string: 2.1 Mesa 18.3.4
OpenGL shading language version string: 1.20
OpenGL extensions:
OpenGL ES profile version string: OpenGL ES 2.0 Mesa 18.3.4
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 1.0.16
OpenGL ES profile extensions:

I’ve attached the bug report file.
nvidia-bug-report-03042022.log.gz (515.4 KB)

Let me know if there’s anything else I should send, or how best to proceed.
My goal is to allow users connecting to the system to run applications that need a minimum OpenGL version of 3.2.

You need to install and use VirtualGL.

Yep! I have that installed, using it, still get the same result within it.

Interestingly, running glxgears on the command line via X forwarding gives 8 fps, while running it within a TurboVNC session gives 430-530 fps. Even so, an OpenGL app still runs extremely choppily in the TurboVNC session, as if it were using software rendering, and the OpenGL that glxinfo reports is still from Mesa, not NVIDIA.
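
A quick way to tell which case a session is in is to look at the renderer string that glxinfo prints. The snippet below is only a minimal sketch: classify_renderer is a hypothetical helper name, and the list of software renderer names (llvmpipe, softpipe, swrast) is an assumption about what Mesa may report, not an exhaustive list.

```sh
#!/bin/sh
# Hypothetical helper: reads glxinfo output on stdin and reports whether the
# "OpenGL renderer string" looks like a Mesa software rasterizer.
classify_renderer() {
    # Keep only the first renderer line; glxinfo prints one per display queried.
    renderer=$(grep -i '^OpenGL renderer string:' | head -n 1)
    case "$renderer" in
        "") echo unknown ;;                               # no renderer line at all
        *llvmpipe*|*softpipe*|*swrast*) echo software ;;  # assumed software names
        *) echo hardware ;;                               # anything else
    esac
}

# Usage inside the TurboVNC session:
#   glxinfo | classify_renderer
#   vglrun glxinfo | classify_renderer
```

With the glxinfo output posted above, the first command would report "software"; the second is the one that should change once VirtualGL is working.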

I can add, here’s the output of inxi -Gx:

inxi -Gx

Graphics:
Device-1: Matrox Systems Integrated Matrox G200eW3 Graphics vendor: Dell
driver: N/A bus-ID: 03:00.0
Device-2: NVIDIA driver: nvidia v: 510.54 bus-ID: ca:00.0
Display: server: X.Org 1.20.13 driver: loaded: nvidia
resolution: 2560x1440~60Hz
OpenGL: renderer: llvmpipe (LLVM 7.0 256 bits) v: 2.1 Mesa 18.3.4
direct render: Yes

Note that the Matrox device still appears despite being disabled in the BIOS.

Also of note, the system is running using UEFI, not BIOS. Would that make a difference?

I guess the Xserver is currently running on the Matrox. To have it run on the Nvidia card, so that VirtualGL takes effect, create a basic /etc/X11/xorg.conf:

Section "Device"
  Identifier "nvidia"
  Driver "nvidia"
  BusID "PCI:202:0:0"
EndSection
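
The BusID above is the decimal form of the bus-ID ca:00.0 that inxi reports in hexadecimal (0xca = 202). A minimal sketch of that conversion follows; pci_to_xorg_busid is a hypothetical helper name, and it assumes the ID has either no PCI domain prefix or the default 0000: one.

```sh
#!/bin/sh
# Hypothetical helper: convert an lspci/inxi style bus ID (hex, "bus:dev.fn")
# into the decimal "PCI:bus:dev:fn" form used in xorg.conf.
pci_to_xorg_busid() {
    id=${1#0000:}      # drop an optional default domain prefix (assumption)
    bus=${id%%:*}      # text before the first colon
    rest=${id#*:}      # "dev.fn"
    dev=${rest%%.*}    # text before the dot
    fn=${rest#*.}      # text after the dot
    # printf accepts C-style hex constants for %d, so prefix each part with 0x.
    printf 'PCI:%d:%d:%d\n' "0x$bus" "0x$dev" "0x$fn"
}

pci_to_xorg_busid ca:00.0    # prints PCI:202:0:0
```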

Tried that, no dice so far.

In /var/log/Xorg.0.log, I see:

[ 1272.051] (II) Initializing extension GLX
[ 1272.051] (II) Initializing extension GLX
[ 1272.051] (II) GLX: Another vendor is already registered for screen 0

Other interesting things:
In /var/log/messages:

messages:Apr 7 16:59:37 takim2 ansible-command: Invoked with creates=None executable=None _uses_shell=True strip_empty_ends=True _raw_params=glxinfo|grep OpenGL removes=None argv=None warn=True chdir=None stdin_add_newline=True stdin=None
messages:Apr 8 13:33:02 takim2 kernel: [ 266.633313] glxinfo[4012]: segfault at 70 ip 00007f007a44c152 sp 00007fffa63f5f38 error 6 in libGLX_mesa.so.0.0.0[7f007a408000+73000]
messages:Apr 8 13:33:02 takim2 audispd: node=takim2 type=ANOM_ABEND msg=audit(1649439182.126:95): auid=37869 uid=37869 gid=37869 ses=1 pid=4012 comm="glxinfo" reason="memory violation" sig=11
messages:Apr 8 13:33:18 takim2 audispd: node=takim2 type=ANOM_ABEND msg=audit(1649439198.822:96): auid=37869 uid=37869 gid=37869 ses=1 pid=4055 comm="glxinfo" reason="memory violation" sig=11
messages:Apr 8 13:33:18 takim2 kernel: [ 283.289666] glxinfo[4055]: segfault at 70 ip 00007fe247483152 sp 00007fff4670d8e8 error 6 in libGLX_mesa.so.0.0.0[7fe24743f000+73000]
messages:Apr 8 13:33:29 takim2 kernel: [ 294.263007] glxinfo[4097]: segfault at 70 ip 00007fcf50b50152 sp 00007ffcb5d17858 error 6 in libGLX_mesa.so.0.0.0[7fcf50b0c000+73000]
messages:Apr 8 13:33:29 takim2 audispd: node=takim2 type=ANOM_ABEND msg=audit(1649439209.821:97): auid=37869 uid=37869 gid=37869 ses=1 pid=4097 comm="glxinfo" reason="memory violation" sig=11
messages:Apr 8 13:34:50 takim2 kernel: [ 374.344422] glxinfo[4191]: segfault at 70 ip 00007f6b07e30152 sp 00007fff079b7718 error 6 in libGLX_mesa.so.0.0.0[7f6b07dec000+73000]
messages:Apr 8 13:34:50 takim2 audispd: node=takim2 type=ANOM_ABEND msg=audit(1649439290.090:101): auid=37869 uid=37869 gid=37869 ses=1 pid=4191 comm="glxinfo" reason="memory violation" sig=11
messages:Apr 8 13:36:42 takim2 audispd: node=takim2 type=ANOM_ABEND msg=audit(1649439402.638:102): auid=37869 uid=37869 gid=37869 ses=1 pid=4468 comm="glxinfo" reason="memory violation" sig=11
messages:Apr 8 13:36:42 takim2 kernel: [ 486.626017] glxinfo[4468]: segfault at 70 ip 00007f9e5cb3a152 sp 00007fff0f9a3188 error 6 in libGLX_mesa.so.0.0.0[7f9e5caf6000+73000]

Also wondering if there’s a version mismatch:

[root@takim2 log]# grep 510 Xorg.0.log
[ 1269.883] (II) NVIDIA dlloader X Driver 510.47.03 Mon Jan 24 23:02:31 UTC 2022
[ 1269.917] (II) NVIDIA GLX Module 510.47.03 Mon Jan 24 22:57:16 UTC 2022

Comparing this with nvidia-settings, that shows:
Nvidia Driver Version: 510.54

and nvidia-smi shows the newer version as well:

nvidia-smi

Fri Apr 8 16:20:36 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.54 Driver Version: 510.54 CUDA Version: 11.6
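
To make the comparison less manual, the two version numbers can be pulled out of the log and the nvidia-smi banner and compared. This is only a sketch: xorg_nvidia_version and smi_nvidia_version are hypothetical helper names, and the patterns assume the line formats shown above.

```sh
#!/bin/sh
# Hypothetical helpers: each reads text on stdin and prints the first NVIDIA
# driver version it finds, assuming the line formats quoted in this thread.

# From an Xorg log: "(II) NVIDIA dlloader X Driver 510.47.03 Mon Jan 24 ..."
xorg_nvidia_version() {
    sed -n 's/.*NVIDIA dlloader X Driver *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# From the nvidia-smi banner: "| NVIDIA-SMI 510.54 Driver Version: 510.54 ..."
smi_nvidia_version() {
    sed -n 's/.*Driver Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Usage on the server:
#   x_ver=$(xorg_nvidia_version < /var/log/Xorg.0.log)
#   k_ver=$(nvidia-smi | smi_nvidia_version)
#   [ "$x_ver" = "$k_ver" ] || echo "mismatch: X driver $x_ver, nvidia-smi $k_ver"
```

With the values above this would report a mismatch between 510.47.03 and 510.54.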

Here’s the result of yum list installed:
yum list installed|grep -i nvidia
cuda.x86_64 1:11.6.55-1.el7 @pmacs-nvidia
cuda-cudart.x86_64 1:11.6.55-2.el7 @pmacs-nvidia
cuda-libs.x86_64 1:11.6.55-1.el7 @pmacs-nvidia
cuda-nvrtc.x86_64 1:11.6.55-1.el7 @pmacs-nvidia
egl-wayland.x86_64 1.1.7-1.el7 @pmacs-nvidia
kmod-nvidia.x86_64 3:510.54-1.el7 @pmacs-nvidia
libcublas.x86_64 1:11.8.1.74-1.el7 @pmacs-nvidia
libcufft.x86_64 2:10.7.0.55-1.el7 @pmacs-nvidia
libcufile.x86_64 1:1.2.0.100-1.el7 @pmacs-nvidia
libcurand.x86_64 2:10.2.9.55-1.el7 @pmacs-nvidia
libcusolver.x86_64 2:11.3.2.55-1.el7 @pmacs-nvidia
libcusparse.x86_64 1:11.7.1.55-1.el7 @pmacs-nvidia
libnpp.x86_64 1:11.6.0.55-1.el7 @pmacs-nvidia
libnvjpeg.x86_64 1:11.6.0.55-1.el7 @pmacs-nvidia
libvdpau.x86_64 1.4-10.el7 @pmacs-nvidia
nvidia-driver.x86_64 3:510.54-3.el7 @pmacs-nvidia
nvidia-driver-cuda-libs.x86_64 3:510.54-3.el7 @pmacs-nvidia
nvidia-driver-libs.x86_64 3:510.54-3.el7 @pmacs-nvidia
nvidia-kmod-common.noarch 3:510.54-1.el7 @pmacs-nvidia
nvidia-libXNVCtrl.x86_64 3:510.54-1.el7 @pmacs-nvidia
nvidia-query-resource-opengl.x86_64
nvidia-query-resource-opengl-lib.x86_64
nvidia-settings.x86_64 3:510.54-1.el7 @pmacs-nvidia

More data:
Not sure why VMware would be the GLX vendor:

glxinfo -B
name of display: :1.0
display: :1 screen: 0
direct rendering: Yes
Extended renderer info (GLX_MESA_query_renderer):
Vendor: VMware, Inc. (0xffffffff)
Device: llvmpipe (LLVM 7.0, 256 bits) (0xffffffff)
Version: 18.3.4
Accelerated: no
Video memory: 257242MB
Unified memory: no
Preferred profile: compat (0x2)
Max core profile version: 0.0
Max compat profile version: 2.1
Max GLES1 profile version: 1.1
Max GLES[23] profile version: 2.0
OpenGL vendor string: VMware, Inc.
OpenGL renderer string: llvmpipe (LLVM 7.0, 256 bits)
OpenGL version string: 2.1 Mesa 18.3.4
OpenGL shading language version string: 1.20

OpenGL ES profile version string: OpenGL ES 2.0 Mesa 18.3.4
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 1.0.16

I may have an idea of another factor.
We have two internal repos in use that might have conflicting packages, or packages that have created issues.

There’s a base repo, and an nvidia repo.
When I do yum list installed|grep mesa, I get:

mesa-dri-drivers.x86_64 18.3.4-12.el7_9 @pmacs-updates
mesa-filesystem.x86_64 18.3.4-12.el7_9 @pmacs-updates
mesa-khr-devel.x86_64 18.3.4-12.el7_9 @pmacs-updates
mesa-libEGL.x86_64 18.3.4-12.el7_9 @pmacs-updates
mesa-libGL.x86_64 18.3.4-12.el7_9 @pmacs-updates
mesa-libGL-devel.x86_64 18.3.4-12.el7_9 @pmacs-updates
mesa-libGLES.x86_64 18.3.4-12.el7_9 @pmacs-updates
mesa-libGLU.x86_64 9.0.0-4.el7 @pmacs-base
mesa-libGLU-devel.x86_64 9.0.0-4.el7 @pmacs-base
mesa-libGLw.x86_64 8.0.0-5.el7 @pmacs-base
mesa-libGLw-devel.x86_64 8.0.0-5.el7 @pmacs-base
mesa-libgbm.x86_64 18.3.4-12.el7_9 @pmacs-updates
mesa-libglapi.x86_64 18.3.4-12.el7_9 @pmacs-updates
mesa-libxatracker.x86_64 18.3.4-12.el7_9 @pmacs-updates

Should I not have these at all on my system, given the goal I’m trying to accomplish?
Thanks!

Please attach the full Xorg.0.log

Sorry, I hadn’t initially seen that request. Since then I’ve done some uninstalls and reinstalls, but I believe I’m back where I started.

I have attached the current Xorg.0.log
Xorg.0.log (22.2 KB)

The Xserver is fine. I guess you either didn’t install VirtualGL correctly or didn’t read the manual on how to use it:
vglrun glxinfo

I’ve installed it and run vglconnect user@server. Then, in that vglconnect session, I run vglrun glxinfo|head and get:

vglrun glxinfo

[VGL] NOTICE: Automatically setting VGL_CLIENT environment variable to
[VGL] 172.16.3.158, the IP address of your SSH client.

and it just hangs.

As a sanity check suggested by the VirtualGL manual, I ran

glxinfo -display :0

and it does indeed show the NVIDIA driver and NVIDIA OpenGL being used.

Is there a soup-to-nuts tutorial that goes through everything needed, from installing the nvidia drivers, to setting up VirtualGL, etc, in a proven way that will work?

Or is there any possible way I could get guided support from someone over a screen share, so I can show every part of the system directly and describe over the phone what I have and what I’m trying to accomplish?

I think you’re making more of this than it really is. VirtualGL setup from the repo is usually a plug-and-play thing: install, connect to the vncserver with your favourite VNC viewer, use vglrun. Done.
vglconnect is for when the nvidia gpu is on a different host. So it’s likely trying to connect to your client, which doesn’t work.
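
One way to see which situation a shell is in is to look at DISPLAY before calling vglrun. This is only a rough sketch: display_kind is a hypothetical helper, and it assumes that a DISPLAY with no host part (such as :1) is an X server on this machine, while a value with a host part (such as localhost:10.0 from SSH X forwarding) points back to the client.

```sh
#!/bin/sh
# Hypothetical helper: classify a DISPLAY value as local, remote, or none.
display_kind() {
    case "$1" in
        "") echo none ;;            # DISPLAY is not set
        :*|unix:*) echo local ;;    # no host part: X server on this machine
        *) echo remote ;;           # host part present: forwarded or remote X
    esac
}

# Usage inside the TurboVNC session (expected to report "local"):
#   display_kind "$DISPLAY"
#   vglrun -d :0 glxinfo | grep -i 'opengl renderer'
```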

Hmm, so installing VirtualGL automatically runs a VNCserver on display :0, and I connect to that?


I understand this is supposed to be plug and play, but it just hasn’t worked out that way for me, or I have not found the correct instructions. I did read the manual and tried to follow along, so please bear with me and let me know the specifics.

No. VirtualGL is the binding link between a software vncserver and a hardware Xserver.
In your first post you said you have a turbovnc server running?

Yep! I’ve tried that. I start the server and connect to it from my laptop.

Within that session, if I run vglrun glxinfo, it hangs after outputting only:
name of display: :1.0

Does
vglrun -d :0 glxinfo | grep -i opengl
work?

Nope. It hangs in the same way.

I thought I had configured the system properly using vglserver_config. I answered No to all of the questions at first, which I know is insecure, but it was merely for testing, and I still have to sort out how to ensure all clients will be in the vglusers local group, considering they’re domain users.
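
For the group question, a small check like the following could confirm whether a given account, local or domain, actually resolves as a member of vglusers. It is a sketch only: in_group is a hypothetical helper, and it assumes group names without spaces, since it splits the output of id -nG on spaces.

```sh
#!/bin/sh
# Hypothetical helper: succeed if user $1 is a member of group $2.
# id -nG goes through the normal name service lookup, so domain users are
# resolved the same way as local ones.
in_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -Fqx "$2"
}

# Usage (someuser is a placeholder account name):
#   if in_group someuser vglusers; then
#       echo "someuser is in vglusers"
#   else
#       echo "someuser is NOT in vglusers"
#   fi
```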

Please post the output of
ps a |grep X