Thanks a million!!! You have been way more helpful than the RHEL support. The server finally booted up into the GUI and I was able to log in to the GUI on the server, but not through the remote Dell iDRAC.
One question: when I run nvidia-smi, I see the Nvidia 430 driver and CUDA 10.1, but should something be listed in the Processes section? I see ‘No running processes found’ but I remember there was something before. Do I need to add/install anything more?
nvidia-smi
Tue Aug 20 13:42:53 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.40       Driver Version: 430.40       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  Off  | 00000000:03:00.0 Off |                    0 |
| N/A   36C    P0    25W / 250W |      4MiB / 16280MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
If the xorg.conf got deleted and the monitor is connected to the onboard matrox server graphics, the xserver might now run on that. Connect the monitor to the nvidia outputs and use a minimal xorg.conf like
The xorg.conf got removed during the update so there’s now the Xserver running on the Matrox. I don’t know about your use case for the nvidia gpu, sounds like you’re connecting only over xrdp, which is a virtual Xserver? If you want/need an Xserver running on the nvidia, just use the /etc/X11/xorg.conf from post #22.
I created /etc/X11/xorg.conf as mentioned in post #22 and the entire GUI was gone. I removed it and the GUI came back. I will check with Dell to see if they can help anyway, because I cannot even access the boot options when I restart the server. I will provide an update once I hear from them. Thank you very much.
The server is connected through a KVM to a VGA output and it has always been that way. The KVM doesn’t have the option to connect to the nvidia card. The issue is that I was able to access the BIOS before through iDRAC but now I’m not able to do that either. The Dell tech support engineer suggested reseating the NVRAM, and we plan to do that and see if it works. Appreciate your help very much, I will keep you posted. Thanks!
I have scheduled the downtime to reseat nvram for tomorrow morning. I think the primary card is coming up as Intel and not Nvidia. I installed Mayavi for the users and when I launch Mayavi, we get the following message.
ERROR: In /work/standalone-x64-build/VTK-source/Rendering/OpenGL2/vtkOpenGLRenderWindow.cxx, line 797
vtkXOpenGLRenderWindow (0x55e1822a25b0): GL version 2.1 with the gpu_shader4 extension is not supported by your graphics driver but is required for the new OpenGL rendering backend. Please update your OpenGL driver. If you are using Mesa please make sure you have version 10.6.5 or later and make sure your driver in Mesa supports OpenGL 3.2.
ERROR: In /work/standalone-x64-build/VTK-source/Rendering/OpenGL2/vtkShaderProgram.cxx, line 445
vtkShaderProgram (0x55e17d111710): 1: #version 120
I will post the output of glxinfo in the morning once I have physical access to the server. Thanks!
With your current setup, I suspect this is expected. Let me put this straight:
Local access:
the monitor is connected to the onboard vga connector which is driven by a “Matrox mga200 server graphics”. This is a simple framebuffer device which only uses mesa software rendering. I suspect RHEL 7 uses Mesa 18 which only reports OpenGL 3.1 for the software renderer. This can be overridden but that’s another story.
Since this simple Matrox device, or better said its driver, doesn’t support output redirection, the nvidia card is not used for graphics in the current setup. It could be used for Cuda but taking a glance at Mayavi, it doesn’t seem to support this. Which leaves me a bit puzzled, why is there an nvidia card in that box if it’s not used anyway? Is that box only used locally or is there a remote graphics use case, e.g. over xrdp+tigervnc+virtualgl or NoMachine?
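To see which renderer a session actually ends up on, the renderer string from glxinfo is usually enough. A rough sketch, assuming glxinfo is installed (on RHEL 7 I believe it comes with the glx-utils package); the function name gl_renderer_kind and the list of matched strings are just my picks for illustration, not an exhaustive classification:

```shell
#!/bin/sh
# Classify the OpenGL renderer from `glxinfo` output read on stdin.
# Prints "software" for Mesa's CPU rasterizers (llvmpipe/softpipe/swrast),
# "nvidia" for renderer strings that look like the NVIDIA driver,
# and "other" for anything else. The matched names are assumptions.
gl_renderer_kind() {
    renderer=$(grep -m1 '^OpenGL renderer string:' | cut -d: -f2-)
    case "$renderer" in
        *llvmpipe*|*softpipe*|*swrast*|*"Software Rasterizer"*) echo software ;;
        *NVIDIA*|*Tesla*|*Quadro*|*GeForce*) echo nvidia ;;
        *) echo other ;;
    esac
}

# Typical use on the server:
#   glxinfo | gl_renderer_kind
```

If this prints "software" in the session where Mayavi is started, the nvidia card is not involved in rendering at all.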
The iDRAC access is now fixed and I can control the server remotely. You are correct, the server is extensively used over xrdp and they set the session to xorg. When they had RDP issues they connected using SSH tunneling with X11 forwarding. The users wanted a good graphics card as they use Matlab for their analysis and my Dell rep suggested to go with Nvidia. Which additional drivers do I have to install to get rid of these errors? Thanks!
locally: connect the monitor to the nvidia card and use the xorg.conf I posted.
remotely: do the same, then install and configure virtualgl and use ‘vglrun ’ inside the xrdp session.
Excuse my delay… I went to connect the monitor to nvidia card but there is no graphical output for the Tesla P100 card. I reached out to the vendor, Dell in my case, and this is what they said:
“The Tesla P100 GPGPU (General Processing GPU) does not have a graphics port. These cards tend to not have direct attach graphics unlike their consumer counter parts (GTX series cards). These cards are compute units that are optimized for data/calculation processing.”
I think that is why the server didn’t boot into the GUI when I used the xorg.conf you suggested. Once I removed it, the server came back up in graphical mode. VGA works fine so I want to ask - should I install and configure virtualgl and use ‘vglrun ’ inside the xrdp session? Thank you.
Sorry, my bad, Teslas of course have no outputs. Regardless of the thread title I somehow thought you were running a Quadro.
To get HW accelerated graphics from the tesla, you will have to run a second Xserver on it and configure virtualgl to use that. Then you can use vglrun locally and remote to get hw accel.
Just as a note: if your remote users are using forwarded X11 over ssh, this won’t work since then indirect GL is used and rendered on the client machine. So this will only work over rdp/vnc.
Actually they are not using X11 forwarding, just xrdp. They used X11 forwarding when the GUI was not showing up, and you helped me get that fixed. Not sure why this is an issue on the server, but when we installed Mayavi on a laptop it opens up fine. Thanks!
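If it is unclear how a given user is connected, the DISPLAY variable usually gives it away. This is only a heuristic sketch under the assumption that sshd sets DISPLAY to a host:N form (e.g. localhost:10.0) for forwarded sessions while local and xrdp sessions get a plain :N; the function name session_kind is made up for this example:

```shell
#!/bin/sh
# Heuristic: guess whether a session uses X11 forwarding over ssh.
# Arguments: $1 = value of DISPLAY, $2 = value of SSH_CONNECTION (may be empty).
session_kind() {
    display=$1
    ssh_conn=$2
    if [ -z "$display" ]; then
        echo "no-display"
    elif [ -n "$ssh_conn" ] && [ "${display#:}" = "$display" ]; then
        # Inside an ssh login and DISPLAY has a host part (does not start with ":")
        echo "ssh-x11-forwarding"
    else
        echo "local-or-xrdp"
    fi
}

# Usage in the user's terminal:
#   session_kind "$DISPLAY" "$SSH_CONNECTION"
```

A result of "ssh-x11-forwarding" would mean vglrun cannot help for that session.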
IMHO, you should stay on the latest stable driver (430) unless you have problems with it and want support from Dell. In that case, they’ll probably ask you to downgrade.
vGPU is for providing virtual gpus to virtual machines, i.e. that every user can have its own virtual workstation. I don’t know about your exact use cases/number of users/etc. but I suspect this would be a bit overblown since this also requires you to set up VMs for each user. If you’re looking for a commercial product, maybe take a look at NoMachine which has also support for virtualgl to make use of the tesla.
Neither vGPU nor NoMachine are click-and-run solutions, though.
The Redhat support engineer also said that 430 is the current driver. I have attached my conversation from 8/12 where he said the same thing that you mentioned. I spoke to my Dell sales guy and he said that graphics rendering should work if users are connecting to the server locally and the end users are connecting using RDP.
What is driving me nuts is that end users are claiming mayavi rendered graphics with python2.7, but now that I upgraded it to mayavi2 and python3.7, OpenGL is not rendering. These end users updated some drivers and messed up the entire GUI connectivity. Redhat didn’t help much and that’s when I posted my question on this forum and you helped me get the GUI working again. RHSupport8-12.txt (3.02 KB)
I guess that upgrading mayavi/vtk raised the required OpenGL level, which the Mesa software renderer doesn’t provide. As mentioned before, you can use overrides to (probably) make it run, use
MESA_GL_VERSION_OVERRIDE=3.3 mayavi2
to make it run using software gl.
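If you want a launcher that only forces the override when it is actually needed, something along these lines could work. It is a sketch, not a tested setup: the 3.2 threshold is taken from the VTK error message above, and the helper names gl_version_at_least and gl_version_from_glxinfo are invented for this example.

```shell
#!/bin/sh
# Return success (0) if OpenGL version $1 is at least version $2.
# Both arguments are "major.minor" strings such as 3.1 or 3.2.
gl_version_at_least() {
    have_major=${1%%.*}; have_minor=${1#*.}
    need_major=${2%%.*}; need_minor=${2#*.}
    [ "$have_major" -gt "$need_major" ] ||
        { [ "$have_major" -eq "$need_major" ] && [ "$have_minor" -ge "$need_minor" ]; }
}

# Extract "major.minor" from the "OpenGL version string" line of glxinfo output (stdin).
gl_version_from_glxinfo() {
    sed -n 's/^OpenGL version string: \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Hypothetical launcher logic:
#   have=$(glxinfo | gl_version_from_glxinfo)
#   if gl_version_at_least "$have" 3.2; then mayavi2
#   else MESA_GL_VERSION_OVERRIDE=3.3 mayavi2; fi
```

Keep in mind the override only changes what Mesa reports, so rendering still happens on the CPU.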
The Dell rep’s claim to just put in a tesla and it’ll magically work is plain wrong. Especially xrdp will always need a virtualgl setup.
The RH support’s claims are closer to the mark but outdated. It’s not that complicated either.
The question is why you need a gui locally, this is the only thing that’s complicating things as it’ll always run on matrox.
I don’t know about the gl lib layout of RHEL7, please post the output of
ls -l /usr/lib/libGL* /usr/lib64/libGL*
You are correct that upgrading mayavi/vtk raised the required OpenGL level, and Mesa has deprecated support for Matrox. The command MESA_GL_VERSION_OVERRIDE=3.3 mayavi2 did open Mayavi fine. I’m waiting on the end user to confirm that they can plot.
and reboot. Afterwards, you’ll have no gui on the local monitor but it’s there, running in the nvidia video memory using a virtual monitor. Please check if the X server is running:
ps aux |grep X
and inspect /var/log/Xorg.0.log
Please post/attach both.
Check if you can still connect over xrdp.
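For a quick look at the log before posting it, a small summary helps to spot whether the nvidia module was loaded and whether the server logged errors. A minimal sketch, assuming the usual Xorg log format with (EE) and (WW) markers at the start of each message; the function name xorg_log_summary is made up:

```shell
#!/bin/sh
# Summarise an Xorg log read on stdin: which modules were loaded and
# how many error "(EE)" and warning "(WW)" lines it contains.
# Only markers at the start of a message (after an optional timestamp)
# are counted, so the "Markers:" legend line is not counted.
xorg_log_summary() {
    awk '
        /LoadModule: "/ { split($0, q, "\""); modules = modules " " q[2] }
        /^(\[[^]]*\] )?\(EE\)/ { errors++ }
        /^(\[[^]]*\] )?\(WW\)/ { warnings++ }
        END { printf "modules:%s\nerrors: %d\nwarnings: %d\n", modules, errors, warnings }
    '
}

# Usage:
#   xorg_log_summary < /var/log/Xorg.0.log
```

This does not replace posting the full log, it just gives a first hint.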
If this is running, install and configure virtualgl. It’s in the epel repo you have already added, so just use yum to install.
Don’t know if the config is started automatically, if not, run
vglserver_config
just use the default values, then restart the display-manager or reboot.
Put your users into the vglusers group, connect over xrdp, open a terminal and run
glxgears
stop it, then run
vglrun glxgears
this should yield much higher fps.
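To compare the two runs without eyeballing the numbers, the FPS lines can be averaged. A small sketch, assuming glxgears prints its usual "N frames in S seconds = X FPS" lines; average_fps is a made-up helper name, and timeout (coreutils) is only used to stop glxgears after a fixed time:

```shell
#!/bin/sh
# Average the FPS figures printed by glxgears (read on stdin).
# Only lines whose last field is "FPS" are used; the value is the field before it.
average_fps() {
    awk '$NF == "FPS" { sum += $(NF - 1); n++ }
         END { if (n) printf "%.1f\n", sum / n; else print "no data" }'
}

# Usage inside the xrdp session:
#   timeout 16 glxgears | average_fps
#   timeout 16 vglrun glxgears | average_fps
```

If both runs give about the same figure, vglrun is probably not reaching the Xserver on the nvidia card.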
to get some on-demand gui on the local monitor, create /usr/local/etc/xorg-matrox.conf