Recommended driver install process for a Dell PowerEdge R740 server with an NVIDIA Tesla P100 GPU and only SSH access

CuriousVisitor · March 21, 2019, 4:36pm

Hi,

I only have SSH access to a Dell PowerEdge R740 server with an NVIDIA Tesla P100 GPU, running Ubuntu 18.04.

I would like to know the recommended process for installing the drivers and toolkits needed for TensorFlow without risking a login loop or a black screen. This already happened once with the driver from the Ubuntu repo: ssh stopped responding and the IT department reinstalled everything, so no nvidia-bug-report is available. Since I do not have direct access to the machine, the problem is not easy to solve without asking IT to redo the whole process.

I have seen other people run into difficulties, for example here: https://devtalk.nvidia.com/default/topic/1046157/linux/ubuntu-16-04-gui-login-loop-after-installing-nvidia-driver/1

Thank you to anyone kind enough to take the time to help me.

generix · March 21, 2019, 5:08pm

It is questionable why a GUI is running on a presumably headless server, and why installing a graphics driver keeps sshd from starting. Still, for the least intrusive install:

1. Download the 418.56 .run installer from here: https://http.download.nvidia.com/XFree86/Linux-x86_64/
2. Run it with the options --dkms --no-opengl-files --no-x-check -Z
3. Reboot.
4. Download the CUDA 10.0 .deb.
5. Do the first three steps of the install instructions.
6. Don't install the cuda package.
7. Instead, run sudo apt install cuda-toolkit-10-0
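
The driver and toolkit steps above can be sketched as a small POSIX shell helper. This is only a sketch under assumptions: the installer file name `NVIDIA-Linux-x86_64-418.56.run` and the per-version subdirectory on the download server are assumed, and the helper names `step`, `install_driver` and `install_toolkit` are hypothetical. The installer flags and the apt package name are the ones given in the post. By default the functions only print the commands; nothing is executed unless `DRY_RUN=0` is set.

```shell
# Sketch of the install steps; prints commands unless DRY_RUN=0 is set.
DRIVER_VERSION="418.56"
# Assumption: installer file name and directory layout on the download server.
RUN_FILE="NVIDIA-Linux-x86_64-${DRIVER_VERSION}.run"
BASE_URL="https://http.download.nvidia.com/XFree86/Linux-x86_64"

# Print a command, and execute it only when DRY_RUN=0.
step() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Download the .run installer, run it with the flags from the post, reboot.
# Each command runs only if the previous one succeeded.
install_driver() {
    step wget "${BASE_URL}/${DRIVER_VERSION}/${RUN_FILE}" &&
    # -Z asks the installer to disable nouveau.
    step sudo sh "./${RUN_FILE}" --dkms --no-opengl-files --no-x-check -Z &&
    step sudo reboot
}

# To be run after the reboot, and after the first three steps of the
# CUDA 10.0 .deb install instructions have been done by hand.
install_toolkit() {
    step sudo apt install cuda-toolkit-10-0
}
```

Sourcing the file and calling `install_driver` shows what would run. Printing first is a deliberate choice here, since on a machine reachable only over SSH it is worth reading each command before letting it execute.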

CuriousVisitor · March 22, 2019, 8:44am

Thank you very much.
When you say “do first three steps of install instructions”, where can I find these instructions? Do you mean the first three items of your list, or something else entirely?

generix · March 22, 2019, 9:15am

Sorry, forgot three words:

do first three steps of install instructions on download page

generix · March 22, 2019, 9:19am

Addendum: judging by other users’ problems, TensorFlow seems to be linked against CUDA 10.0, so it doesn’t work with CUDA 10.1. Download the 10.0 .deb (network) from the archives:
CUDA Toolkit 10.0 Archive | NVIDIA Developer
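
Since the version matters here, a quick check of which toolkit release actually got installed may help. The sketch below parses the text printed by `nvcc --version`; the function names `cuda_release` and `check_cuda_for_tensorflow` are hypothetical, and the expected value `10.0` is taken from the post above rather than from any TensorFlow documentation.

```shell
# Extract the release number (e.g. "10.0") from `nvcc --version` text on stdin.
cuda_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Compare the release found on stdin with the one TensorFlow is said to need.
check_cuda_for_tensorflow() {
    expected="10.0"   # assumption taken from the post, not from TensorFlow docs
    found="$(cuda_release)"
    if [ "$found" = "$expected" ]; then
        echo "ok: CUDA $found"
    else
        echo "mismatch: found CUDA '${found}', expected $expected"
        return 1
    fi
}
```

A possible use is `nvcc --version | check_cuda_for_tensorflow`. If `nvcc` is not on the PATH, the binary is usually found under the toolkit's install directory, which is assumed here to be `/usr/local/cuda-10.0/bin`.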

CuriousVisitor · March 22, 2019, 9:29am

Thank you very much for your time, I will try that as soon as I can.

CuriousVisitor · March 25, 2019, 3:54pm

Thank you very much, it worked for me, but I first had to remove and purge the nouveau driver.
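
The post does not say which commands were used to get rid of nouveau. One common approach, shown here only as a sketch and not as what was actually done in this thread, is to blacklist the module through a modprobe configuration file. The function name `write_nouveau_blacklist` is hypothetical, and the optional directory argument exists only so the function can be tried out on a scratch directory first.

```shell
# Write a modprobe blacklist file for nouveau into the given directory
# (defaults to /etc/modprobe.d, which requires root).
write_nouveau_blacklist() {
    dir="${1:-/etc/modprobe.d}"
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' \
        > "${dir}/blacklist-nouveau.conf"
}
```

On Ubuntu, the file is typically followed by `sudo update-initramfs -u` and a reboot, after which `lsmod | grep nouveau` should print nothing.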
