Not able to update Tesla P100 driver from 384 to 418

Sorry for the delay. Thanks for the steps, but the users have said they can’t allow a maintenance window this weekend for me to carry them out, so I will have to wait until they release the server. I really want to thank you for being extremely helpful with my issue. Without your guidance I would never have gotten the GUI back online. Once I get the chance to carry out the steps I will post the output. Thanks a million for your timely help; I appreciate it very much.

I have a question. End users are saying CUDA is not installed. I thought I had followed the steps:
yum clean all
yum install cuda-drivers
reboot

but when I run ‘nvcc -V’ I get
bash: nvcc: command not found...

Also nvidia-smi shows the following.
Tue Sep 17 16:28:08 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.40       Driver Version: 430.40       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  Off  | 00000000:03:00.0 Off |                    0 |
| N/A   30C    P0    24W / 250W |      4MiB / 16280MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Should I run the commands from the NVIDIA site to install CUDA?

https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&target_distro=RHEL&target_version=7&target_type=rpmlocal

Please let me know. Thanks!

Do the first three steps but not the last one (sudo yum -y install nvidia-driver-latest-dkms cuda). Instead, run
sudo yum install cuda-toolkit-10-1
Otherwise you’ll kill your already installed driver.

Thanks for your prompt reply. I will schedule time to do it and post the result. Appreciate it very much.

I have a slightly different question and would like your opinion. The users want to plot images (3D accelerated graphics) but are not able to. Hardware rendering used to work with MayaVi, but it no longer works with MayaVi 2 because support for the Matrox card has been deprecated in the Mesa 3D graphics library. The server vendor, Dell, connected me to an NVIDIA rep, who said that the P100 will do graphics, but that it’s meant for VDI. The end users think the hardware is the problem and that adding an appropriate graphics card should take care of the rendering issue. I want to know whether adding a graphics card will fix the hardware rendering issue and, if so, which one I should go with. The NVIDIA rep asked me to let them know which card is compatible so they can send me the pricing. Thanks!

Putting in another graphics card won’t change the fact that you still need to set up VirtualGL for your xrdp users to get hardware acceleration. It just doesn’t work without it.

So, as you said, I ran the first three of the commands below:

$ wget http://developer.download.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda-repo-rhel7-10-1-local-10.1.243-418.87.00-1.0-1.x86_64.rpm
$ sudo rpm -i cuda-repo-rhel7-10-1-local-10.1.243-418.87.00-1.0-1.x86_64.rpm
$ sudo yum clean all
$ sudo yum -y install nvidia-driver-latest-dkms cuda    (I did not run this one)

and then ran:
$ sudo yum install cuda-toolkit-10-1

When I issued $ nvidia-smi, I got:
Fri Sep 20 21:20:18 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.40       Driver Version: 430.40       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  Off  | 00000000:03:00.0 Off |                    0 |
| N/A   32C    P0    24W / 250W |      4MiB / 16280MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Also, $ nvcc --version
bash: nvcc: command not found...
And
$ which nvcc
/usr/bin/which: no nvcc in (/usr/bin/anaconda2/bin:/usr/bin/anaconda2/condabin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/kiran/.local/bin:/home/kiran/bin)

Am I missing something?

Also, $ sudo rpm -i cuda-repo-rhel7-10-1-local-10.1.243-418.87.00-1.0-1.x86_64.rpm gives
package cuda-repo-rhel7-10-1-local-10.1.243-418.87.00-1.0-1.x86_64 is already installed

OK, before I do the post-installation steps suggested here: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#post-installation-actions, I want to confirm with you whether I should also include this part of the PATH export: :/usr/local/cuda-10.1/NsightCompute-2019.1${PATH:+:${PATH}}. I don’t see an NsightCompute-2019.1 directory in /usr/local/cuda-10.1:

$ ls /usr/local/cuda-10.1/
bin extras lib64 libnvvp nsightee_plugins nvvm samples src tools
doc include libnsight LICENSE nvml README share targets version.txt

Please let me know. Thanks!

NsightCompute is a separate download, so there’s no need to add that path.
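
Without the NsightCompute entry, the post-installation exports would look roughly like the sketch below. It assumes the toolkit lives in /usr/local/cuda-10.1, as in the ls output above; CUDA_HOME is only a convenience variable introduced here, not something the installer sets.

```shell
# Assumed install prefix, taken from the `ls /usr/local/cuda-10.1/` output.
# CUDA_HOME is a convenience variable for this sketch only.
CUDA_HOME=/usr/local/cuda-10.1

# Prepend the toolkit's bin and lib64 directories. The ${VAR:+:${VAR}} form
# appends ":<old value>" only when the variable was already non-empty, so no
# stray trailing colon ends up in the result.
export PATH="${CUDA_HOME}/bin${PATH:+:${PATH}}"
export LD_LIBRARY_PATH="${CUDA_HOME}/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"

# Check whether nvcc is now visible in this shell.
command -v nvcc || echo "nvcc still not found under ${CUDA_HOME}/bin"
```

These exports only affect the current shell. To make them permanent for the end users they would have to go into a profile script (for example a file under /etc/profile.d/, which is an assumption about how this server is set up).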

Thanks… That worked. Appreciate it.

I want to know if this is possible: can the Tesla P100 be used for parallel computing and graphics simultaneously by setting up another X server and installing VirtualGL?

Yes, with some limitations; see “Using CUDA and X” on the NVIDIA site.

Thanks. The NVIDIA reps suggested going ahead with another graphics card.

The NVIDIA and Dell representatives recommend the NVIDIA Quadro P4000 GPU card for graphics. I’m hoping it will work well with VirtualGL and another X server.

A question: once I get the card and install it, should I follow the steps you mentioned in post #40, or should hardware rendering just work with the new Quadro graphics card? Since there will be two graphical displays, I want the Quadro configured for use both over XRDP and locally, and the Matrox disabled if possible. Please let me know. Thanks!

I would not recommend it for the following reason:
VirtualGL uses the X server at :0 for rendering. This is the GDM screen. As soon as you log in locally, GDM will spawn a second X server at :1 for the user session, resulting in a VT switch, so the X server at :0 will be inaccessible to VirtualGL. So as soon as you log in locally, the rendering from xrdp will stop working.
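
To see which X servers exist at a given moment (for example before and after a local login), something like the following sketch may help. It assumes the X servers use the conventional /tmp/.X11-unix/X<n> sockets; list_x_displays is a hypothetical helper name, not an existing tool.

```shell
# Hypothetical helper: print one display name (":0", ":1", ...) per X socket.
# Assumption: the X servers create their sockets as /tmp/.X11-unix/X<n>.
list_x_displays() {
    dir=${1:-/tmp/.X11-unix}
    for sock in "$dir"/X*; do
        [ -e "$sock" ] || continue   # glob matched nothing: no sockets present
        name=${sock##*/}             # strip the directory, e.g. "X0"
        echo ":${name#X}"            # strip the leading "X", e.g. ":0"
    done
}

list_x_displays
```

If only :0 is listed while nobody is logged in locally and :1 appears after a local login, that matches the situation described above.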

Thanks. The server is accessed exclusively over XRDP by the users; I only access it locally during a maintenance window, after rebooting the server, to make sure it is online.
I had posted previously that the Mayavi application, after the upgrade to version 2, no longer supports hardware rendering because support for the native Matrox card has been deprecated. So we decided to test by installing VirtualGL on a laptop with an Intel graphics card, to see how to configure hardware rendering over XRDP and be prepared for whenever we get the NVIDIA Quadro card.
After we installed VirtualGL on the laptop and tried to run mayavi2 with the ‘vglrun mayavi2’ command, we got a segmentation fault with a core dump (over XRDP). When we run the plain mayavi2 command we get the same “update OpenGL driver” error as before. We even tried uninstalling VirtualGL and still get the same error when trying to launch Mayavi2. glxinfo shows that rendering still goes through llvm instead of the native Intel card. Locally Mayavi works fine, but it doesn’t work over XRDP on the laptop with the Intel graphics card. What am I missing?

Is it not possible to achieve hardware rendering over XRDP? Do I have to use another way to get this application to work? Please let me know your suggestion/opinion.

For mayavi2 to work with VirtualGL and the NVIDIA driver, VirtualGL 2.6 or later is required: https://github.com/VirtualGL/virtualgl/releases
Though I doubt that was the problem on your test install.

  • An X server has to be running on the notebook; the normal login suffices.
  • The user connecting over xrdp has to have access to it:
    When connected over xrdp, what’s the output of
    DISPLAY=:0 glxinfo
  • Does glxgears run?
    vglrun glxgears
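
The checks above can be scripted roughly as follows, to be run from inside the xrdp session. This is only a sketch: is_software_renderer and report_renderer are hypothetical helper names, and treating anything other than llvmpipe/softpipe as hardware rendering is an assumption.

```shell
# Hypothetical helper (not part of VirtualGL): succeeds if a glxinfo
# "OpenGL renderer string" line names a Mesa software renderer.
is_software_renderer() {
    case "$1" in
        *llvmpipe*|*softpipe*) return 0 ;;
        *) return 1 ;;
    esac
}

# Run the given glxinfo command and report which renderer it reports.
# $1 is a label for the output; the remaining arguments are the command.
report_renderer() {
    label=$1; shift
    line=$("$@" 2>/dev/null | grep -i "OpenGL renderer string" || true)
    if [ -z "$line" ]; then
        echo "$label: no renderer reported (command failed or display not accessible)"
    elif is_software_renderer "$line"; then
        echo "$label: software rendering -> $line"
    else
        # Assumption: anything that is not llvmpipe/softpipe is a hardware renderer.
        echo "$label: probably hardware rendering -> $line"
    fi
}

report_renderer "display :0 directly" env DISPLAY=:0 glxinfo
report_renderer "through VirtualGL" vglrun glxinfo
```

If the first line reports no renderer, the xrdp user most likely has no access to the X server at :0, which is the second point in the list above.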