Error when Installing CUDNN with CUDA already installed Ubuntu 18.04

I am attempting to install cuDNN to use with CUDA. CUDA is already installed when I run

nvcc --version

I receive:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85

But when I run

nvidia-smi

I get:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.50       Driver Version: 430.50       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:D5:00.0  On |                  N/A |
| 28%   36C    P8    15W / 250W |    885MiB / 10997MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+

One says I am using CUDA 9.1 and the other says 10.1?

Anyway, I am trying to install cuDNN for CUDA 10.1 using the instructions from https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html. When I get to the

make clean && make

step of testing the cuDNN install I receive the following error:

rm -rf *o
rm -rf mnistCUDNN
Linking agains cublasLt = false
CUDA VERSION: 9010
TARGET ARCH: x86_64
HOST_ARCH: x86_64
TARGET OS: linux
SMS: 30 35 50 53 60 61 62 70
/bin/sh: 1: /usr/local/cuda/bin/nvcc: not found
>>> WARNING - FreeImage is not set up correctly. Please ensure FreeImage is set up correctly. <<<

So it looks like a file is missing from

/usr/local/cuda/bin/

. When I check the

/usr/local/cuda/

directory, it only contains the ‘include’ and ‘lib64’ subdirectories. Was there a problem with the CUDA install? Did a directory get deleted somewhere along the way?
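For reference, a quick sanity check for whether a toolkit directory is complete, sketched as a shell helper (the function name `is_complete_toolkit` is made up for illustration):

```shell
# Hypothetical helper: a toolkit directory installed by NVIDIA's own
# packages should contain bin/ (with nvcc), include/ and lib64/.
is_complete_toolkit() {
    [ -d "$1/bin" ] && [ -d "$1/include" ] && [ -d "$1/lib64" ]
}

# Check the default location (/usr/local/cuda is normally a symlink
# to a versioned directory such as /usr/local/cuda-9.1):
if is_complete_toolkit /usr/local/cuda; then
    echo "toolkit looks complete"
else
    echo "toolkit is incomplete: no bin/ means no /usr/local/cuda/bin/nvcc"
fi
```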

The driver reports, via nvidia-smi, the latest CUDA version it supports: 10.1
The compiler reports what CUDA version it belongs to: 9.1

Generally speaking, newer drivers can support older CUDA versions. So a driver that supports versions up to and including 10.1 also supports CUDA 9.x and CUDA 8.x, for example. Example from the machine on which I am writing this post:

C:\Users\Norbert\My Programs>nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Mon_Jan__9_17:32:33_CST_2017
Cuda compilation tools, release 8.0, V8.0.60

C:\Users\Norbert\My Programs>nvidia-smi
Tue Oct 08 13:22:55 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 431.02       Driver Version: 431.02       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro P2000       WDDM  | 00000000:01:00.0  On |                  N/A |
| 82%   80C    P0    62W /  75W |   1030MiB /  5120MiB |     94%      Default |
+-------------------------------+----------------------+----------------------+
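The rule of thumb (the CUDA version nvidia-smi reports is an upper bound on what the driver supports) can be sketched as a tiny shell check; the function name is my own:

```shell
# driver_supports TOOLKIT_VER DRIVER_MAX_VER
# Newer drivers support older toolkits, so a toolkit works when its
# version is <= the CUDA version nvidia-smi reports for the driver.
driver_supports() {
    newest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)
    [ "$newest" = "$2" ]
}

driver_supports 9.1 10.1 && echo "9.1 toolkit OK with a 10.1-capable driver"
driver_supports 10.2 10.1 || echo "10.2 toolkit needs a newer driver"
```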

Thank you for the response.

I tried to reinstall cuDNN 9.0 and am still running into the same issue:

/bin/sh: 1: /usr/local/cuda/bin/nvcc: not found

It can’t find the nvcc binary, but nvcc commands still work from the shell. When I check the cuda folder, it does not contain bin at all.

Any ideas?

Since I give Ubuntu the widest possible berth, I am the wrong person to answer that question.

There is no cuDNN 9.0, currently; 7.6 is the current version.

It’s possible you installed CUDA using a non-NVIDIA method, in which case it may not be where the NVIDIA tools expect it to be (at /usr/local/cuda).

what is the result of running:

which nvcc

at a command line?
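What that answer usually reveals can be sketched as a small path classifier (the function name and labels are mine; the path conventions are NVIDIA's installers putting nvcc under /usr/local/cuda*/bin and Ubuntu's nvidia-cuda-toolkit package putting it in /usr/bin):

```shell
# classify_nvcc PATH: guess the install method from where nvcc lives.
classify_nvcc() {
    case "$1" in
        /usr/local/cuda*/bin/nvcc) echo "NVIDIA installer" ;;
        /usr/bin/nvcc)             echo "Ubuntu repository package" ;;
        *)                         echo "unknown" ;;
    esac
}

# Classify whatever nvcc is actually on the PATH, if any:
classify_nvcc "$(command -v nvcc || echo none)"
```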

Hi, I was able to fix the make issue by manually copying the nvcc file into the cuda folder. Now when I compile and test I get the following output:

Testing single precision
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm ...
Fastest algorithm is Algo 0
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.010688 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.031968 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.040896 time requiring 3464 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.051840 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.069632 time requiring 207360 memory
Cublas failure
Error code 0
gemv.h:77
Aborting...

I’ve tried with all 4 cuDNN download options and have also tried two archived releases as well and they all report the same error.

I’ve seen other posts on this error but no good solutions. I am running a GeForce RTX 2080 Ti with driver 430.50 and CUDA 9.1.85. Is the issue the CUDA version? Does it need to be 10.0 or greater?

Thanks

That’s definitely not what I would do. If you want to do your own thing, I won’t be able to help you.

Yes, if it were me I would certainly be using CUDA 10.0 or higher on any Turing GPU.

Note that there is a dedicated forum area for asking CUDNN-specific questions.

https://devtalk.nvidia.com/default/board/305/cudnn/

So would you recommend uninstalling CUDA 9.1 and replacing it with CUDA 10.0?

As for your previous question, ‘which nvcc’ reports it is located in /usr/bin/.

Also, I did not install CUDA on this machine; it was installed by a previous user. It seems 10.0 and 10.1 were installed at some point, but it is running 9.1.

I already indicated that if it were me, I would switch to CUDA 10.0.

In fact, if it were me, I would do a fresh load of the OS and start with a clean slate.

That’s the result of an Ubuntu-provided install method, not an NVIDIA-provided one. The cuDNN package expects a proper NVIDIA-provided install of CUDA.
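If the goal is to replace the Ubuntu-packaged toolkit with NVIDIA's own, the outline would be roughly the following. This sketch only prints the plan rather than running it, since the commands modify the system; it assumes Ubuntu's package is named nvidia-cuda-toolkit and NVIDIA's apt package for 10.0 is cuda-10-0.

```shell
# Assumed package names (verify against your system before running anything):
distro_pkg=nvidia-cuda-toolkit   # Ubuntu's package; installs nvcc to /usr/bin
nvidia_pkg=cuda-10-0             # NVIDIA's package; installs to /usr/local/cuda-10.0

# Print the plan instead of executing it:
cat <<EOF
# 1. Remove the Ubuntu-packaged toolkit:
sudo apt-get remove --purge $distro_pkg
sudo apt-get autoremove
# 2. Add NVIDIA's apt repository following the CUDA installation guide,
#    then install the toolkit proper:
sudo apt-get install $nvidia_pkg
EOF
```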

OK, thank you for the help. I feel that a fresh load of the OS would probably be best.