Is NVIDIA driver 410.57 incompatible with CUDA 9.0?

The GeForce RTX 2080 needs driver 410 to work properly, but when I installed CUDA 9.0 after driver 410 had been installed, I found that the CUDA 9.0 installation removed driver 410 and installed driver 384. After removing driver 384 and installing 410 again, I found that the sample no longer works and reports this cuBLAS error:

cudnnGetVersion() : 7300 , CUDNN_VERSION from cudnn.h : 7300 (7.3.0)
Host compiler version : GCC 5.4.0
There are 4 CUDA capable devices on your machine :
device 0 : sms 46 Capabilities 7.5, SmClock 1815.0 Mhz, MemSize (Mb) 7952, MemClock 7000.0 Mhz, Ecc=0, boardGroupID=0
device 1 : sms 46 Capabilities 7.5, SmClock 1815.0 Mhz, MemSize (Mb) 7952, MemClock 7000.0 Mhz, Ecc=0, boardGroupID=1
device 2 : sms 46 Capabilities 7.5, SmClock 1815.0 Mhz, MemSize (Mb) 7952, MemClock 7000.0 Mhz, Ecc=0, boardGroupID=2
device 3 : sms 46 Capabilities 7.5, SmClock 1815.0 Mhz, MemSize (Mb) 7952, MemClock 7000.0 Mhz, Ecc=0, boardGroupID=3
Using device 0

Testing single precision
Loading image data/one_28x28.pgm
Performing forward propagation …
Testing cudnnGetConvolutionForwardAlgorithm …
Fastest algorithm is Algo 0
Testing cudnnFindConvolutionForwardAlgorithm …
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.008576 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.047104 time requiring 3464 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.065408 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.070208 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.082080 time requiring 207360 memory
Cublas failure
Error code 0
gemv.h:77
Aborting…

It seems that CUDA 9.0 can’t work with driver 410. Is this a bug?
Driver 410 works well with CUDA 10.0, but some libraries I use depend on CUDA 9.0, and I can’t give them up easily.

Driver 410 is compatible with CUDA 9.0, but you mustn’t install the cuda package. Instead, install cuda-toolkit-9-0 after you have installed the 410 driver.

How can I install cuda-toolkit-9-0? I downloaded CUDA 9.0 from this page:
https://developer.nvidia.com/cuda-90-download-archive?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1604&target_type=deblocal
and it says that what I downloaded is cuda-toolkit-9-0.

Thanks for your help. I installed cuda-toolkit-9-0 with
apt-get install cuda-toolkit-9-0
However, the sample still fails with this error:

cudnnGetVersion() : 7300 , CUDNN_VERSION from cudnn.h : 7300 (7.3.0)
Host compiler version : GCC 5.4.0
There are 4 CUDA capable devices on your machine :
device 0 : sms 46 Capabilities 7.5, SmClock 1815.0 Mhz, MemSize (Mb) 7952, MemClock 7000.0 Mhz, Ecc=0, boardGroupID=0
device 1 : sms 46 Capabilities 7.5, SmClock 1815.0 Mhz, MemSize (Mb) 7952, MemClock 7000.0 Mhz, Ecc=0, boardGroupID=1
device 2 : sms 46 Capabilities 7.5, SmClock 1815.0 Mhz, MemSize (Mb) 7952, MemClock 7000.0 Mhz, Ecc=0, boardGroupID=2
device 3 : sms 46 Capabilities 7.5, SmClock 1815.0 Mhz, MemSize (Mb) 7952, MemClock 7000.0 Mhz, Ecc=0, boardGroupID=3
Using device 0

Testing single precision
Loading image data/one_28x28.pgm
Performing forward propagation …
Testing cudnnGetConvolutionForwardAlgorithm …
Fastest algorithm is Algo 0
Testing cudnnFindConvolutionForwardAlgorithm …
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.012288 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.110752 time requiring 3464 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.152864 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.304352 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.314144 time requiring 203008 memory
Cublas failure
Error code 0
gemv.h:77
Aborting…

And the CUDA samples can’t be compiled; the build fails with this error:


/usr/bin/ld: cannot find -lGL
collect2: error: ld returned 1 exit status
Makefile:293: recipe for target ‘marchingCubes’ failed
make[1]: *** [marchingCubes] Error 1
make[1]: Leaving directory ‘/home/tusimple/PerformanceTest/samples/2_Graphics/marchingCubes’
Makefile:52: recipe for target ‘2_Graphics/marchingCubes/Makefile.ph_build’ failed
make: *** [2_Graphics/marchingCubes/Makefile.ph_build] Error 2

I just checked: with CUDA 9.0, cuDNN 7.3.0, driver 410.57, and gcc 6.4.0, everything works and the mnistCUDNN test passes. I’m not using Ubuntu, though. Did you recompile the sample?
Otherwise, it would point to a packaging error. How did you install the driver?
/usr/bin/ld: cannot find -lGL
looks like missing symlinks for /usr/lib/libGL.so or /usr/lib/libGL.so.1.
What’s the output of
ls -l /usr/lib/libGL*
?
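
The symlink check suggested above can also be scripted. This is only a minimal sketch in Python: the list of directories is an assumption (the thread only mentions /usr/lib), and the helper names `inspect_libgl` and `linker_can_find_gl` are made up for illustration. It lists every libGL* entry, shows where each symlink points, and flags links whose target is missing.

```python
import glob
import os

# Assumed search locations; only /usr/lib is mentioned in the thread.
CANDIDATE_DIRS = ["/usr/lib", "/usr/lib/x86_64-linux-gnu", "/usr/lib64"]


def inspect_libgl(dirs=CANDIDATE_DIRS):
    """Return (path, link_target_or_None, resolves) for every libGL* entry."""
    results = []
    for directory in dirs:
        for path in sorted(glob.glob(os.path.join(directory, "libGL*"))):
            target = os.readlink(path) if os.path.islink(path) else None
            # os.path.exists follows symlinks, so a dangling link yields False.
            results.append((path, target, os.path.exists(path)))
    return results


def linker_can_find_gl(results):
    """-lGL needs an entry named libGL.so or libGL.a that actually resolves."""
    return any(
        os.path.basename(path) in ("libGL.so", "libGL.a") and resolves
        for path, _target, resolves in results
    )


if __name__ == "__main__":
    entries = inspect_libgl()
    for path, target, resolves in entries:
        arrow = f" -> {target}" if target else ""
        state = "ok" if resolves else "DANGLING"
        print(f"{path}{arrow} [{state}]")
    if not linker_can_find_gl(entries):
        print("No usable libGL.so found in the assumed directories.")
```

A dangling libGL.so link, or a libGL.so.1 with no plain libGL.so next to it, would be consistent with the "cannot find -lGL" linker error, since the linker looks for the unversioned name.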

Yes, with CUDA 9.0, cuDNN 7.3.0, driver 410.57, and gcc 6.4.0, everything works on a 1080 Ti. But if you use a GeForce RTX 2080, you will get the wrong result.
I ran the following test (nvcr.io/nvidia/mxnet:18.08-py2 is the Docker image provided by NVIDIA):

  1. Install nvidia-driver 410.57
  2. Download the Docker image nvcr.io/nvidia/mxnet:18.08-py2
  3. Uninstall cuDNN 7.2 in the container
  4. Install cuDNN 7.3 in the container
  5. Run the mnistCUDNN sample

With these steps everything works on the 1080 Ti, but the 2080 gives this error:

cudnnGetVersion() : 7300 , CUDNN_VERSION from cudnn.h : 7300 (7.3.0)
Host compiler version : GCC 5.4.0
There are 4 CUDA capable devices on your machine :
device 0 : sms 46 Capabilities 7.5, SmClock 1815.0 Mhz, MemSize (Mb) 7952, MemClock 7000.0 Mhz, Ecc=0, boardGroupID=0
device 1 : sms 46 Capabilities 7.5, SmClock 1815.0 Mhz, MemSize (Mb) 7952, MemClock 7000.0 Mhz, Ecc=0, boardGroupID=1
device 2 : sms 46 Capabilities 7.5, SmClock 1815.0 Mhz, MemSize (Mb) 7952, MemClock 7000.0 Mhz, Ecc=0, boardGroupID=2
device 3 : sms 46 Capabilities 7.5, SmClock 1815.0 Mhz, MemSize (Mb) 7952, MemClock 7000.0 Mhz, Ecc=0, boardGroupID=3
Using device 0

Testing single precision
Loading image data/one_28x28.pgm
Performing forward propagation …
Testing cudnnGetConvolutionForwardAlgorithm …
Fastest algorithm is Algo 0
Testing cudnnFindConvolutionForwardAlgorithm …
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.008576 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.047104 time requiring 3464 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.065408 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.070208 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.082080 time requiring 207360 memory
Cublas failure
Error code 0
gemv.h:77
Aborting…

Maybe CUDA 9.0 can’t work properly on the 2080?
I found this warning when creating the container, and it may be the reason:

===========
== MXNet ==

NVIDIA Release 18.08 (build 599768)

Container image Copyright © 2018, NVIDIA CORPORATION. All rights reserved.
Copyright © 2015-2016 by MXNet Contributors

Various files include modifications © NVIDIA CORPORATION. All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying project or file.
WARNING: Detected NVIDIA GeForce RTX 2080 GPU, which is not yet supported in this version of the container
WARNING: Detected NVIDIA GeForce RTX 2080 GPU, which is not yet supported in this version of the container
WARNING: Detected NVIDIA GeForce RTX 2080 GPU, which is not yet supported in this version of the container
WARNING: Detected NVIDIA GeForce RTX 2080 GPU, which is not yet supported in this version of the container
ERROR: No NVIDIA supported GPU(s) detected to run this container

When I use nvcr.io/nvidia/mxnet:18.09-py3 (this Docker image uses CUDA 10), everything works.

That may very well be the case: CUDA 9.0 not getting along with the compute capability (cc) 7.5 of the RTX cards, either depending on the use case or in general. Out of luck, then.
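
The compatibility question here has two separate parts: whether the driver is new enough for the toolkit, and whether the toolkit natively targets the GPU's compute capability. A rough sketch of that check follows. The table values are taken from NVIDIA's published release notes as far as I recall them (CUDA 9.0 targets up to sm_70 and needs Linux driver 384.81 or newer; CUDA 10.0 adds sm_75 and needs 410.48 or newer), so treat them as assumptions to verify. The function name `check` is hypothetical.

```python
# Per toolkit: highest natively targeted compute capability and the minimum
# Linux driver version. Values are assumptions to verify against the
# CUDA release notes.
TOOLKITS = {
    "9.0": {"max_cc": (7, 0), "min_driver": (384, 81)},
    "10.0": {"max_cc": (7, 5), "min_driver": (410, 48)},
}


def parse_version(text):
    """Turn '410.57' into (410, 57) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def check(toolkit, gpu_cc, driver):
    """Report driver sufficiency and native architecture support separately."""
    info = TOOLKITS[toolkit]
    return {
        "driver_ok": parse_version(driver) >= info["min_driver"],
        "native_arch": parse_version(gpu_cc) <= info["max_cc"],
    }


if __name__ == "__main__":
    # Compute capability 7.5 comes from the logs; 6.1 is the GTX 1080 Ti.
    print("CUDA 9.0  + RTX 2080   :", check("9.0", "7.5", "410.57"))
    print("CUDA 9.0  + GTX 1080 Ti:", check("9.0", "6.1", "410.57"))
    print("CUDA 10.0 + RTX 2080   :", check("10.0", "7.5", "410.57"))
```

For the RTX 2080 with CUDA 9.0 the driver check passes but the architecture check does not, which matches the 1080 Ti versus 2080 difference reported above. It does not by itself explain the cuBLAS failure, since the same error is also reported in this thread with CUDA 10.1 and 10.2.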

I just tried an RTX 2080 Ti with CUDA 9.0 and cuDNN 7.0.5. It can show the cuDNN version, but it cannot pass the mnistCUDNN test and fails with the same error code as yours. Does this mean that CUDA 9.0 and cuDNN 7.0.5 are not compatible with the RTX 2080 Ti?

I’m not sure, but CUDA 10 works well.

Can no one fix this issue?

Looks like it still doesn’t work.

I have the 2080ti, Ubuntu 18.04, gcc: 7.4.0, CUDA: 10.1.234, cuDNN: 10.1

On the same page!

I have 2070, Ubuntu 18.04, GCC: 7.4.0, CUDA: 10.1.243 (update2), cuDNN: 7.6.5.32

Ubuntu 18.04 2080Super Cuda 10.1 cuDNN 7.6.5 gcc 7.4.0

Cublas Failure
Error code 0
gemv.h:77
Aborting…

I note that the cuDNN sample conv_sample does report ‘pass’, and RDNN seems to run properly to completion. And all the NVIDIA_CUDA-10.1_Samples that I’ve tried run correctly as far as I can tell.

Using CUDA 10.2 does not work either:

cudnnGetVersion() : 7501 , CUDNN_VERSION from cudnn.h : 7605 (7.6.5)
Host compiler version : GCC 7.4.0
There are 1 CUDA capable devices on your machine :
device 0 : sms 68 Capabilities 7.5, SmClock 1575.0 Mhz, MemSize (Mb) 11016, MemClock 7000.0 Mhz, Ecc=0, boardGroupID=0
Using device 0

Testing single precision
Loading image data/one_28x28.pgm
Performing forward propagation …
Testing cudnnGetConvolutionForwardAlgorithm …
Fastest algorithm is Algo 0
Testing cudnnFindConvolutionForwardAlgorithm …
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.023264 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.044256 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.053312 time requiring 3464 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.071680 time requiring 203008 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.073728 time requiring 2057744 memory
Cublas failure
Error code 0
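
One detail in the log above is that cudnnGetVersion() reports 7501 while cudnn.h reports 7605, so the library loaded at run time and the header used at compile time appear to come from different cuDNN releases. A small sketch for decoding those integers, assuming the cuDNN 7 encoding of major * 1000 + minor * 100 + patch (consistent with the "7300 (7.3.0)" lines in the earlier logs); the helper names are made up for illustration:

```python
def decode_cudnn_version(value):
    """Split a cuDNN 7.x version integer such as 7605 into (major, minor, patch)."""
    major, rest = divmod(value, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def describe(runtime_value, header_value):
    """Compare the value from cudnnGetVersion() with CUDNN_VERSION from cudnn.h."""
    runtime = decode_cudnn_version(runtime_value)
    header = decode_cudnn_version(header_value)
    status = "match" if runtime == header else "MISMATCH"
    fmt = lambda v: ".".join(str(n) for n in v)
    return f"runtime {fmt(runtime)} vs header {fmt(header)}: {status}"


if __name__ == "__main__":
    print(describe(7300, 7300))  # values from the first logs in the thread
    print(describe(7501, 7605))  # values from the CUDA 10.2 log
```

Whether this mismatch has anything to do with the cuBLAS failure is not established in the thread, but it suggests that more than one cuDNN installation is present on that machine.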