CUDA 10 installation problems on Ubuntu 18.04

vitaliy.gyrya · December 17, 2018, 6:43pm

I installed the NVIDIA drivers for a Quadro M500M as follows:

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update
sudo apt install nvidia-390
sudo reboot

Checked the installation by

nvidia-smi

and got:

Mon Dec 17 10:13:49 2018       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.87                 Driver Version: 390.87                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro M500M        Off  | 00000000:06:00.0 Off |                  N/A |
| N/A   46C    P0    N/A /  N/A |    688MiB /  2004MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1305      G   /usr/lib/xorg/Xorg                           359MiB |
|    0      1496      G   /usr/bin/gnome-shell                         166MiB |
|    0      1984      G   ...uest-channel-token=18290818570900219022   160MiB |
+-----------------------------------------------------------------------------+

So everything seems fine with the drivers.
Just in case, I restarted the system.

Followed instructions for CUDA installation from https://www.tensorflow.org/install/gpu.
Selected:

Linux > x86_64 > Ubuntu > 18.04 > deb (local)

and followed the instructions listed underneath:

sudo dpkg -i cuda-repo-ubuntu1804-10-0-local-10.0.130-410.48_1.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-10-0-local-10.0.130-410.48/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda-libraries-10-0

In place of the fourth line above, I also tried

sudo apt-get install cuda

Running

nvcc --version

did not give me info about the version of CUDA:

Command 'nvcc' not found, but can be installed with:

sudo apt install nvidia-cuda-toolkit

So I tried

sudo apt install nvidia-cuda-toolkit
nvcc --version

only to get

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85

So, why is CUDA 10.0 not detected? I thought I was following the instructions to the letter.
Why is there a different method of installing CUDA that is not listed on the CUDA web page, and why does it install an older version of CUDA?
What is a good WORKING method for installing CUDA 10.0?

Robert_Crovella · December 17, 2018, 7:02pm

follow the instructions in the linux install guide (Installation Guide Linux :: CUDA Toolkit Documentation)

get your installers from http://www.nvidia.com/getcuda

and you won’t be able to use that ppa driver nvidia-390 with CUDA 10. Use the driver bundled with the CUDA 10 installers instead.

Now that you’ve already installed the wrong drivers, read the linux install guide carefully. Failure to follow it carefully will result in more trouble.

vitaliy.gyrya · December 17, 2018, 8:03pm

Ok, assuming I did something wrong before. Here are my steps, skipping verification steps:

My ‘/usr/local/’ contains ‘cuda’ and ‘cuda-10.0’.
Uninstalling a Toolkit runfile installation:

sudo /usr/local/cuda-10.0/bin/uninstall_cuda_10.0.pl

results in

command not found

There is no ‘uninstall_cuda*’ in ‘/usr/local/cuda-10.0/bin/’.
Also there is no ‘uninstall_cuda*’ in ‘/usr/local/cuda/bin/’

Continuing with the next line:

sudo /usr/bin/nvidia-uninstall

Same result. No file.

Continuing with the next line:

sudo apt-get --purge remove cuda

I got

... Removing cuda (10.0.130-1) ...

Looking in ‘/usr/local/’ I still see non-empty ‘cuda’ and ‘cuda-10.0’.

Doing

nvcc --version

I get

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85

So, did I successfully uninstall a Toolkit runfile installation?

This does not make sense to me. Are ‘Cuda compilation tools’ the same as ‘Cuda toolkit’?
Judging by the messages, the system found version 10.0 when I asked to uninstall it, yet nvcc still reports version 9.1. What’s going on? What am I missing?

Robert_Crovella · December 17, 2018, 8:07pm

You didn’t perform a runfile installation to begin with. Please re-read the install guide. This:

sudo apt-get install cuda

is not a runfile installation.

Robert_Crovella · December 17, 2018, 8:16pm

You’ll need to get the nvidia-390 driver package off your system. Since those instructions didn’t come from NVIDIA, and the driver was not bundled by NVIDIA, but instead by a 3rd-party, you may need to check elsewhere for removal instructions. But something like:

sudo apt-get --purge remove nvidia-390

should work, I think.

Then reinstall cuda:

sudo dpkg -i cuda-repo-ubuntu1804-10-0-local-10.0.130-410.48_1.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-10-0-local-10.0.130-410.48/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda

That should install the 410.48 driver for you. Verify that it was successful after a reboot with

nvidia-smi

The reported driver version should be 410.48

If it is not, your system was not properly cleaned up. I won’t be able to guide you through other clean up steps. A generally working option is to re-install the OS.

If the driver is installed correctly, cuda should be installed as well.

Now follow steps in section 7 of the linux install guide to:

perform the mandatory post-install steps handling PATH and LD_LIBRARY_PATH
verify the cuda install by building and running a few sample codes, such as deviceQuery and vectorAdd
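
The PATH and LD_LIBRARY_PATH step can be sketched as follows. This is a minimal sketch, assuming the default install prefix /usr/local/cuda-10.0 and a 64-bit system whose libraries live under lib64; the CUDA_HOME variable name is only a convenience here, so adjust all of it to your own install.

```shell
# Assumed install prefix; change it if your toolkit lives elsewhere.
CUDA_HOME=/usr/local/cuda-10.0

# Prepend the toolkit's bin directory so this nvcc wins over any older one.
export PATH="${CUDA_HOME}/bin${PATH:+:${PATH}}"

# Prepend the 64-bit library directory for the runtime loader.
export LD_LIBRARY_PATH="${CUDA_HOME}/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"

# Report which nvcc the shell now resolves, if any.
if command -v nvcc >/dev/null 2>&1; then
    nvcc --version | tail -n 1
else
    echo "nvcc not found on PATH"
fi
```

These exports only last for the current shell. To make them permanent, the same two export lines would typically go into a startup file such as ~/.bashrc.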

vitaliy.gyrya · December 17, 2018, 9:05pm

I did

sudo apt-get purge nvidia*

Now

nvcc --version

returns

Command 'nvcc' not found, but can be installed with:
sudo apt install nvidia-cuda-toolkit

Then followed your directions:

sudo dpkg -i cuda-repo-ubuntu1804-10-0-local-10.0.130-410.48_1.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-10-0-local-10.0.130-410.48/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda

Restarted. Checking

nvidia-smi

I get

Mon Dec 17 13:44:11 2018       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.78       Driver Version: 410.78       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro M500M        Off  | 00000000:06:00.0 Off |                  N/A |
| N/A   48C    P0    N/A /  N/A |    386MiB /  2004MiB |      6%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1270      G   /usr/lib/xorg/Xorg                           230MiB |
|    0      1492      G   /usr/bin/gnome-shell                         112MiB |
|    0      1883      G   ...uest-channel-token=15365897714315109728    41MiB |
+-----------------------------------------------------------------------------+

So, all seems good for now.

Following steps starting from 7:

export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}

Verifying driver versions

cat /proc/driver/nvidia/version

I am getting

NVRM version: NVIDIA UNIX x86_64 Kernel Module  410.78  Sat Nov 10 22:09:04 CST 2018
GCC version:  gcc version 7.3.0 (Ubuntu 7.3.0-27ubuntu1~18.04)

nvcc --version

I am getting

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130

So the problem I had before was solved already.

Now trying to do “7.2.3.2. Compiling the Examples” where I ran into issues.
What is this path

~/NVIDIA_CUDA-10.0_Samples?

It does not exist. Instead I see

/usr/local/cuda-10.0/samples

There is a “Makefile” there. But when I do

make

I get an error:

make[1]: Entering directory '/usr/local/cuda-10.0/samples/0_Simple/fp16ScalarProduct'
/usr/local/cuda-10.0/bin/nvcc -ccbin g++ -I../../common/inc  -m64    -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_75,code=compute_75 -o fp16ScalarProduct.o -c fp16ScalarProduct.cu
Assembler messages:
Fatal error: can't create fp16ScalarProduct.o: Permission denied
Makefile:288: recipe for target 'fp16ScalarProduct.o' failed
make[1]: *** [fp16ScalarProduct.o] Error 1
make[1]: Leaving directory '/usr/local/cuda-10.0/samples/0_Simple/fp16ScalarProduct'
Makefile:51: recipe for target '0_Simple/fp16ScalarProduct/Makefile.ph_build' failed
make: *** [0_Simple/fp16ScalarProduct/Makefile.ph_build] Error 2

Am I running the right samples? What does this mean?

Robert_Crovella · December 17, 2018, 9:09pm

you don’t have write access to the directories where those sample codes are located.

do something like:

sudo make -k
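
An alternative that avoids building as root is to copy the samples into a directory you own and build there. A minimal sketch, assuming the samples sit in /usr/local/cuda-10.0/samples as reported above; the destination directory and the helper name copy_samples are hypothetical.

```shell
# copy_samples SRC DST: copy the read-only samples tree into a writable directory.
copy_samples() {
    src=$1
    dst=$2
    if [ ! -d "$src" ]; then
        echo "samples directory not found: $src" >&2
        return 1
    fi
    mkdir -p "$dst" && cp -R "$src"/. "$dst"/
}

# Example use, guarded so nothing happens when the toolkit samples are absent.
SAMPLES_SRC=/usr/local/cuda-10.0/samples
if [ -d "$SAMPLES_SRC" ]; then
    copy_samples "$SAMPLES_SRC" "$HOME/cuda-10.0-samples"
    echo "now run: make -k -C $HOME/cuda-10.0-samples"
fi
```

Building in the copy leaves the installed tree untouched and keeps the build products owned by your user rather than root.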

vitaliy.gyrya · December 17, 2018, 9:23pm

This worked:

sudo make -k

It has not finished yet, but it has been doing something for more than five minutes. So I assume all is well now. Thank you for your help! Very much appreciate it.

Robert_Crovella · December 17, 2018, 9:29pm

Yes, it takes a while to build all the sample codes.

byvalentino · March 14, 2019, 11:11am

CUDA 10 should be installed. However, there is no sign of it in /usr/local/cuda, where I find 2 cuda-9 versions.
What happened?

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 415.18       Driver Version: 415.18       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:08:00.0 Off |                  N/A |
| 35%   39C    P2    58W / 260W |    613MiB / 10989MiB |     15%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 208...  Off  | 00000000:41:00.0 Off |                  N/A |
| 35%   34C    P8     5W / 260W |    579MiB / 10986MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      7082      C   ...ce/embeddings/embeddingsenv/bin/python3   325MiB |
|    0     10137      C   ...ce/embeddings/embeddingsenv/bin/python3   277MiB |
|    1      1486      G   /usr/lib/xorg/Xorg                            16MiB |
|    1      7082      C   ...ce/embeddings/embeddingsenv/bin/python3   275MiB |
|    1     10137      C   ...ce/embeddings/embeddingsenv/bin/python3   275MiB |
+-----------------------------------------------------------------------------+

Robert_Crovella · March 14, 2019, 2:05pm

I don’t know what “2 cuda-9” means

Do you mean previously you had cuda 9 installed and you want to know where it is now?

Look in /usr/local, not /usr/local/cuda

byvalentino · March 14, 2019, 3:57pm

Sorry for my poor explanation.
In the local folder I have 2 cuda folders.

:/usr/local$ ls
bin  cuda  cuda-9.0  etc  games  include  lib  man  sbin  share  src

Within each folder, according to version.txt I have the same version of cuda

CUDA Version 9.0.176

My questions are:

Where is cuda-10?
Why is it in the kernel when I can’t find it anywhere?
Is there a way of finding it without reinstalling everything?

Thanks in advance for the help!

Robert_Crovella · March 14, 2019, 4:00pm

Did you install CUDA 10?

It looks to me like you simply haven’t installed CUDA 10. You have an updated GPU driver (415.18). However, the fact that nvidia-smi indicates: CUDA Version: 10.0 doesn’t actually mean you have CUDA 10 installed.
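
One way to see which toolkits are actually on disk, independent of what nvidia-smi prints, is to look for cuda* directories and read their version.txt files. A sketch, assuming toolkits were installed under the default /usr/local prefix; list_cuda_toolkits is a hypothetical helper name.

```shell
# list_cuda_toolkits [ROOT]: print each cuda* directory under ROOT
# together with the first line of its version.txt, if present.
list_cuda_toolkits() {
    root=${1:-/usr/local}
    for dir in "$root"/cuda*; do
        # Skip the unexpanded glob when nothing matches.
        [ -d "$dir" ] || continue
        if [ -f "$dir/version.txt" ]; then
            printf '%s: %s\n' "$dir" "$(head -n 1 "$dir/version.txt")"
        else
            printf '%s: no version.txt\n' "$dir"
        fi
    done
}

list_cuda_toolkits /usr/local
```

Note that /usr/local/cuda is typically a symlink to one of the versioned directories, which would explain two entries reporting the same version.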

byvalentino · March 19, 2019, 4:13am

I checked the Pre-installation Actions (Installation Guide Linux :: CUDA Toolkit Documentation)
I don’t understand how to use the 2.7. Handle Conflicting Installation Methods:

CUDA Version 9.0.176
NVIDIA-SMI 415.18 Driver Version: 415.18 CUDA Version: 10.0
Is there any conflict?

I performed steps 1 to 4 of section 3.6 (Ubuntu).

If I run step 5 (sudo apt-get install cuda), will I keep both CUDA 9.0 and 10.1, or will 9.0 disappear?
Since the driver is already up to date, should I run “sudo apt-get install cuda-toolkit-10-1” instead?

Thanks again for your help!

zokiperic · June 13, 2019, 12:56am

I followed your instructions and have found that they installed driver version 10.

NVIDIA-SMI 410.104      Driver Version: 410.104      CUDA Version: 10.0

And the minimum driver for CUDA 10 is 410.42.

Now I do not even know how to remove these and you just wasted a few hours.

May I know why you ship the wrong version of the drivers with CUDA?

Robert_Crovella · June 13, 2019, 1:18am

410.104 is a newer driver than 410.42. So it satisfies the minimum driver requirement for CUDA 10.

104 > 42
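
The point is that driver versions compare field by field as integers, not as decimal numbers (read as a decimal, 410.104 would look smaller than 410.42). A small sketch of that comparison, assuming GNU sort with its -V (version sort) option is available; driver_at_least is a hypothetical helper name.

```shell
# driver_at_least INSTALLED MINIMUM: succeed if INSTALLED >= MINIMUM,
# comparing dot-separated fields numerically via GNU sort -V.
driver_at_least() {
    installed=$1
    minimum=$2
    lowest=$(printf '%s\n%s\n' "$installed" "$minimum" | sort -V | head -n 1)
    [ "$lowest" = "$minimum" ]
}

if driver_at_least 410.104 410.42; then
    echo "410.104 satisfies the 410.42 minimum"
fi
```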

harshshah510 · June 13, 2019, 9:27am

I did a fresh install of Ubuntu after trying to install CUDA and cuDNN yesterday. I installed the NVIDIA 410 driver (ONLY THE DRIVER AND NOTHING ELSE) and rebooted my system.

The problem I have is that if I run “nvidia-smi” in a terminal, it shows “Cuda : 10.0” even though I have never installed CUDA. Should I still follow your guide to install CUDA on Linux? The last time I did that, I ended up with multiple CUDA versions on my system and had to reinstall Ubuntu.

Nvidia Driver version : 410.104
nvidia-smi command : Working
OS : Ubuntu 18.04

PS: When I install nvidia-driver-396, it doesn’t install CUDA on its own. But all the versions above 410 install CUDA on their own (or at least that’s what the nvidia-smi command shows).

Any help would be appreciated.

Robert_Crovella · June 13, 2019, 12:03pm

The CUDA version shown in nvidia-smi command (on newer driver versions) does not indicate that CUDA is actually installed, or what version of CUDA is installed. It indicates what is the highest version of CUDA that the driver is compatible with.
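
To make the distinction concrete, the two numbers can be pulled out of the respective outputs and shown side by side. A rough sketch that parses sample lines quoted earlier in this thread instead of calling the tools; the helper names smi_cuda_version and nvcc_release are hypothetical, and the sed patterns assume the output formats shown here.

```shell
# smi_cuda_version: read nvidia-smi output on stdin, print the "CUDA Version" field.
smi_cuda_version() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# nvcc_release: read `nvcc --version` output on stdin, print the release number.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p' | head -n 1
}

# Sample lines copied from outputs quoted in this thread.
smi_line='NVIDIA-SMI 410.104      Driver Version: 410.104      CUDA Version: 10.0'
nvcc_line='Cuda compilation tools, release 9.1, V9.1.85'

echo "driver supports up to CUDA $(printf '%s\n' "$smi_line" | smi_cuda_version)"
echo "installed nvcc is CUDA $(printf '%s\n' "$nvcc_line" | nvcc_release)"
```

On a real system the same functions could be fed from the live commands, for example `nvidia-smi | smi_cuda_version` and `nvcc --version | nvcc_release`. Older drivers such as 390.87 print no CUDA Version field at all, in which case the first helper prints nothing.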

suryadi · June 28, 2019, 4:42pm

Dear NVIDIA,

My nvidia-smi

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.26       Driver Version: 430.26       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 00000000:01:00.0  On |                  N/A |
| 35%   53C    P2    41W / 180W |   1627MiB /  8118MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1119      G   /usr/lib/xorg/Xorg                           863MiB |
|    0      1259      G   /usr/bin/gnome-shell                         395MiB |
|    0      2642      G   ...equest-channel-token=643813161121753532   256MiB |
|    0      3693      C   /usr/lib/libreoffice/program/soffice.bin     105MiB |
|    0      7966      G   gnome-control-center                           2MiB |
+-----------------------------------------------------------------------------+

my nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85

I need to have CUDA 10.0.
Would you mind adding cuda_10.0.130_410.48_linux.run to https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ ?

Is there any workaround to get a CUDA version higher than 9.1?

Thank you very much in advance.

Warmest Regards,
Suryadi

Robert_Crovella · June 28, 2019, 4:49pm

if you use the package manager install method, just do:

sudo apt-get install cuda-toolkit-10-0

if you use the runfile install method, download the runfile installer you already mentioned (cuda_10.0.130_410.48_linux.run), run it, and select no when prompted to install the driver (your 430.xx driver is fine for all this).

If you have no idea what any of this means, please read the linux install guide:

Installation Guide Linux :: CUDA Toolkit Documentation

You can get older installers here:

https://www.nvidia.com/getcuda

Use the legacy release button to access the toolkit installer archive. You’ll note that older documentation versions are available there online as well.
