Nvidia-smi fails on various google cloud VM's with Tesla K80 GPU

user55124 · November 18, 2021, 5:02pm

I have followed the guide on Google Cloud using Ubuntu 18 and 20 (I have also tried Ubuntu Lite, Debian, and CentOS 7):

Unfortunately, after completing the lengthy install I get this:

me@gpu:~$ nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

I have tried installing via the script and via the direct downloads from the NVIDIA site for CUDA 10.

I have also tried some of the things recommended here, with no luck:

generix · November 19, 2021, 8:44am

Please run nvidia-bug-report.sh as root and attach the resulting nvidia-bug-report.log.gz file to your post.

user55124 · November 19, 2021, 5:37pm

I was able to get it working. The mistake I was making was not doing the pre-installation steps before running the cuda_10.1.243_418.87.00_linux.run script. I was under the impression the *.run file would do everything for me. It would help if users were told they MUST do the pre-installation steps. Specifically I had to do this for Ubuntu 18:

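sudo nano /etc/modprobe.d/blacklist-nouveau.conf
# add the following two lines to the file, then save and exit:
blacklist nouveau
options nouveau modeset=0
# rebuild the initramfs so the blacklist applies at boot, then reboot:
sudo update-initramfs -u
sudo reboot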
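
For anyone scripting a VM image, the same pre-installation step can probably be done without opening an editor. The following is only a minimal sketch, not an official NVIDIA procedure: the function names `write_nouveau_blacklist` and `nouveau_loaded` are made up for illustration, and the optional path arguments exist only so the functions can be tried on a scratch file first.

```shell
#!/bin/sh
# Sketch (hypothetical helper names): non-interactive version of the
# nouveau blacklist step described in the post above.

# Write the two blacklist lines to the given file.
# Defaults to the path used in the post; writing there requires root.
write_nouveau_blacklist() {
    conf_path="${1:-/etc/modprobe.d/blacklist-nouveau.conf}"
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' > "$conf_path"
}

# Return 0 if the nouveau kernel module is listed as loaded, 1 otherwise.
# Reads /proc/modules by default; a different file can be passed for testing.
nouveau_loaded() {
    modules_file="${1:-/proc/modules}"
    grep -q '^nouveau ' "$modules_file"
}
```

Run as root, the sequence would be `write_nouveau_blacklist`, then `update-initramfs -u`, then a reboot. After the reboot, `nouveau_loaded` should return non-zero before the *.run installer is started.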

This seems like a bit of a “hack”, so not sure why nvidia can’t make the installation process more robust? They make a bazillion of these cards. It’s not like some homemade product with a niche user base…

system · December 3, 2021, 5:37pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
