I’ve installed CUDA 8.0 through the runfile for Ubuntu 16.04 but I can’t get my code that works on my other machine to run. When I try nvidia-smi I get: “NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.”
rc nvidia-304 304.134-0ubuntu0.16.04.1 amd64 NVIDIA legacy binary driver - version 304.134
ii nvidia-367 375.39-0ubuntu0.16.04.1 amd64 Transitional package for nvidia-375
ii nvidia-375 375.39-0ubuntu0.16.04.1 amd64 NVIDIA binary driver - version 375.39
ii nvidia-common 1:0.4.17.2 amd64 transitional package for ubuntu-drivers-common
rc nvidia-cuda-toolkit 7.5.18-0ubuntu1 amd64 NVIDIA CUDA development toolkit
rc nvidia-opencl-icd-304 304.134-0ubuntu0.16.04.1 amd64 NVIDIA OpenCL ICD
ii nvidia-opencl-icd-375 375.39-0ubuntu0.16.04.1 amd64 NVIDIA OpenCL ICD
ii nvidia-prime 0.8.2 amd64 Tools to enable NVIDIA's Prime
ii nvidia-settings 378.13-0ubuntu0~gpu16.10.2 amd64 Tool for configuring the NVIDIA graphics driver
I’ve tried the deb file install, adding the PPA repo and using that, rebooted my machine, but nothing seems to work. Can someone help me?
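In a dpkg listing like the one above, the first column is the package state: ii means installed, while rc means the package was removed but its configuration files are still present. A minimal sketch for filtering such a listing down to the NVIDIA packages that are actually installed could look like this (the function name installed_nvidia_packages is made up for illustration):

```shell
# Hypothetical helper: reads `dpkg -l` style lines on stdin and prints
# "name version" for every package whose name starts with "nvidia" and
# whose state column is "ii" (installed).
installed_nvidia_packages() {
  awk '$1 == "ii" && $2 ~ /^nvidia/ { print $2, $3 }'
}

# Typical use on a Debian/Ubuntu machine:
#   dpkg -l | installed_nvidia_packages
```

Mixed versions in the result (for example a driver package and an nvidia-settings package from different release series) are worth noting when asking for help, although they are not necessarily the cause of the error.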
I wasn’t able to resolve the issue, but I was able to determine the cause. I’m using a passthrough VM, and the GPU I use apparently isn’t supported with Ubuntu under passthrough VMs.
I have set the PATH, but nvidia-smi gives the error: “NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.”
ii nvidia-387 387.26-0ubuntu1 amd64 NVIDIA binary driver - version 387.26
ii nvidia-387-dev 387.26-0ubuntu1 amd64 NVIDIA binary Xorg driver development files
ii nvidia-modprobe 387.26-0ubuntu1 amd64 Load the NVIDIA kernel driver and create device files
ii nvidia-opencl-icd-387 387.26-0ubuntu1 amd64 NVIDIA OpenCL ICD
ii nvidia-prime 0.8.2 amd64 Tools to enable NVIDIA's Prime
ii nvidia-settings 387.26-0ubuntu1 amd64 Tool for configuring the NVIDIA graphics driver
Same here. Today I updated to the latest CUDA (9.1.85-1) with driver 387.26-1, and it doesn’t work.
Kernel:
Linux pop-01 4.13.0-26-generic #29~16.04.2-Ubuntu SMP Tue Jan 9 22:00:44 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
I tested on a clean installation and it doesn’t work either. After rebooting, etc., I only get “NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.”
$ sudo apt-get install cuda
Reading package lists… Done
Building dependency tree
Reading state information… Done
cuda is already the newest version (9.1.85-1).
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
$ sudo apt-get install cuda-drivers
Reading package lists… Done
Building dependency tree
Reading state information… Done
cuda-drivers is already the newest version (387.26-1).
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
On a new system or an old one, the error is the same.
$ dpkg -l | grep nvidia
ii nvidia-387 387.26-0ubuntu1 amd64 NVIDIA binary driver - version 387.26
ii nvidia-387-dev 387.26-0ubuntu1 amd64 NVIDIA binary Xorg driver development files
ii nvidia-modprobe 387.26-0ubuntu1 amd64 Load the NVIDIA kernel driver and create device files
ii nvidia-opencl-icd-387 387.26-0ubuntu1 amd64 NVIDIA OpenCL ICD
ii nvidia-prime 0.8.2 amd64 Tools to enable NVIDIA’s Prime
ii nvidia-settings 387.26-0ubuntu1 amd64 Tool for configuring the NVIDIA graphics driver
I did an update on my Ubuntu 16.04 test machine today and got the same messages. I wiped the partitions and did a fresh install with Ubuntu 17.10. I went to software updates and changed the graphics driver to nvidia 384.111. No other drivers or packages installed. Opened a terminal window:
ii nvidia-384 384.111-0ubuntu0.17.10.1 amd64 NVIDIA binary driver - version 384.111
ii nvidia-opencl-icd-384 384.111-0ubuntu0.17.10.1 amd64 NVIDIA OpenCL ICD
ii nvidia-prime 0.8.5 amd64 Tools to enable NVIDIA's Prime
ii nvidia-settings 384.69-0ubuntu1 amd64 Tool for configuring the NVIDIA graphics driver
nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
Fresh install. The GUI appears to work, but I need to program in CUDA. Not sure if going back to 16.04 will help. I have no idea what to try next. I am very interested in a solution.
I discovered one workaround for this problem: use an older kernel version.
On my Ubuntu 16.04.2 machine I ran an update and saw that it included both a CUDA update and a kernel update.
When I rebooted the machine I got the “NVIDIA-SMI has failed because it couldn’t…” message. I tried on a new installation and on an old one, and the main difference was the kernel version.
On kernel 4.13.0-26 the NVIDIA driver doesn’t recognize the cards, in my case 8 x 1070 (for mining purposes). When I boot a previous version (4.10.0-42) and reinstall CUDA 9.1.85-1, the machine works as usual.
I think the CUDA driver has a problem with the new kernel.
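One way to check the kernel theory is to compare the running kernel against the kernels the NVIDIA module was built for. This is only a sketch: it assumes the driver module is built through DKMS and that dkms status prints one line per module and kernel ending in "installed"; the function name nvidia_built_for and the sample line in the comment are illustrative, not taken from any of the machines in this thread.

```shell
# Hypothetical helper: reads `dkms status` output on stdin and succeeds
# (exit status 0) if some line mentions nvidia, the given kernel release,
# and the word "installed". Otherwise it exits with status 1.
nvidia_built_for() {
  awk -v k="$1" '
    tolower($0) ~ /nvidia/ && index($0, k) > 0 && /installed/ { found = 1 }
    END { exit !found }
  '
}

# Typical use, with the full release string of the running kernel:
#   dkms status | nvidia_built_for "$(uname -r)" || echo "no NVIDIA module for this kernel"
#
# Illustrative input line (format assumed):
#   nvidia-387, 387.26, 4.10.0-42-generic, x86_64: installed
```

If the running kernel is missing from that list, nvidia-smi has no kernel module to talk to, which would be consistent with the error message. The match is a plain substring check, so pass the complete release string rather than a shortened version.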
If you still have an older kernel installed, you only need to select it by editing /etc/default/grub and putting in something like this:
GRUB_DEFAULT="1>3"
This value changes depending on your installation and updates. Check Grub2 - Community Help Wiki. In my case, grub.cfg has a submenu as its second entry, and the kernel I wanted to load was the fourth entry inside it. Remember: GRUB numbers entries starting from zero (0), which is why the second entry becomes “1” and the fourth becomes “3”: “1>3”.
You could also use a utility called “Grub Customizer”.
Be aware: if you don’t make these changes carefully, you can lose access to your server.
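To work out the right "submenu>entry" value for a given machine, it can help to print the menu entries with their zero-based indices. The sketch below assumes a grub.cfg in the layout that grub-mkconfig normally produces (top-level menuentry and submenu lines start at column 0, entries inside a submenu are indented, and titles are in single quotes); the function name list_grub_entries is made up, and a hand-edited grub.cfg may not follow this layout.

```shell
# Hypothetical helper: prints "INDEX: TITLE" for each GRUB menu entry.
# Top-level entries get a single number; entries inside a submenu get
# "SUBMENU>ENTRY", which is the form GRUB_DEFAULT expects.
list_grub_entries() {
  awk -F"'" '
    BEGIN { top = -1 }
    /^menuentry /       { top++; printf("%d: %s\n", top, $2) }
    /^submenu /         { top++; child = -1; printf("%d: %s\n", top, $2) }
    /^[ \t]+menuentry / { child++; printf("%d>%d: %s\n", top, child, $2) }
  ' "$1"
}

# Typical use (the file may need root to read):
#   list_grub_entries /boot/grub/grub.cfg
```

After changing GRUB_DEFAULT, run sudo update-grub, reboot, and confirm the result with uname -r.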
Guys, I am using a Google Cloud instance that is being charged at $2.5/hour, so I cannot redo the entire setup on a different instance. Any update on the issue?
Hi, I have a relatively simple configuration with only one Tesla K80 GPU device attached to my VM on Google Cloud Platform. I use the script below to install the CUDA drivers with root privileges on Ubuntu 16.04.
#!/bin/bash
echo "Checking for CUDA and installing."
I was facing the same issue as well on Google Cloud Platform. I have been running with no issues for the last month. Now today I run nvidia-smi and get the same error message “NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.”
I originally installed using the same process as vibhor6wvke.
I was able to solve it using the link from ElPop, by rebooting into a previous version of the kernel.
Additionally, this link was helpful for booting from a new kernel on a virtual machine.
I am also facing the problem above. When I installed CUDA, nvidia-387 was installed automatically, but nvidia-387 couldn’t communicate with my GTX 1080.
Instead, a manually installed nvidia-384 works with nvidia-smi.
Your 6-year-old Dell may be running a different kernel. It looks like the Ubuntu 16.04 HWE kernel was recently upgraded from 4.10 to 4.13, but the Ubuntu 16.04 regular kernel is still at 4.4. So, if your new Dell is running HWE and your old one isn’t, that would explain it.
I’m having the same problem on Google Cloud with Ubuntu 16.04 and a Tesla K80.
The fix proposed by ElPop does not work for me: even after downloading and installing a different kernel and changing the GRUB config according to his instructions, the system reboots into the same kernel.
Does anyone have any idea why?
sudo vim /etc/default/grub
sudo grub-set-default "GNU/Linux, with Linux 4.10.0-041000-generic"
sudo grub-reboot "GNU/Linux, with Linux 4.10.0-041000-generic"
sudo update-grub
sudo reboot
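One possible reason for commands like these having no effect: according to the GRUB manual, grub-set-default and grub-reboot only take effect when /etc/default/grub contains GRUB_DEFAULT=saved. In addition, an entry that sits inside a submenu has to be addressed with the submenu title in front, in the form "SUBMENU TITLE>ENTRY TITLE", using the exact titles from grub.cfg. The check below is only a sketch for the first condition; the function name grub_default_is_saved is made up.

```shell
# Hypothetical helper: succeeds (exit status 0) if the given GRUB defaults
# file sets GRUB_DEFAULT to "saved", with or without double quotes.
grub_default_is_saved() {
  grep -Eq '^[[:space:]]*GRUB_DEFAULT="?saved"?[[:space:]]*$' "$1"
}

# Typical use:
#   grub_default_is_saved /etc/default/grub \
#     || echo "grub-set-default / grub-reboot will be ignored until GRUB_DEFAULT=saved"
```

If the check fails, set GRUB_DEFAULT=saved in /etc/default/grub, run sudo update-grub, and then repeat the grub-set-default step. After the reboot, uname -r shows which kernel was actually booted.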
@omer.stein1 it didn’t work for me either, so I ended up re-creating the whole Google Compute Engine instance. But after setting it all up again, it still didn’t work. So for me the problem was the driver for the P100 GPU on Google Cloud.
After installing the latest driver (nvidia-390) following this guide, it finally works again:
Also this might be helpful, if the driver update doesn’t fix it for you: