sample mnistCUDNN failed with 'Cublas failure gemv.h:77' Ubuntu 18.04.3 Cuda10.1 2080S CUDNN 7.6.5

Hi All,
I notice that other people are having a similar problem, but the proposed solutions did not help me. If someone could review my configuration and comment, I’d appreciate it. Aside from mnistCUDNN, the other CUDNN samples seem to work, as do the CUDA-10.1 samples. Thanks, /jd

Ubuntu 18.04.3 LTS

nvidia-smi
NVIDIA-SMI 435.21 Driver Version: 435.21 CUDA Version: 10.1

mnistCUDNN
cudnnGetVersion(): 7605 , CUDNN_VERSION from cudnn.h : 7605 (7.6.5)
Host compiler version : GCC 7.4.0
There are 1 CUDA capable devices on your machine :
device 0 : sms 48 Capabilities 7.5
using device 0
Testing single precision
Loading image data/one_28x28.pgm
Performing forward propagation …
Testing cudnnConvolutionForwardAlgorithm …
Fastest algorithm is Algo 0
Testing cudnnFindConvolutionForwardAlgorithm …
^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.010240 time requiring 0 memory
^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.026688 time requiring 57600 memory
^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.027552 time requiring 3464 memory
^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.045056 time requiring 203008 memory
^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.053024 time requiring 205774 memory
Cublas failure
Error code 0
gemv.h:77
aborting…
api.log (41.4 KB)

See this topic for several other people having the same problem with very similar configuration as mine. The topic was marked resolved with the suggestion of moving from CUDA 9 to CUDA 10. Since I am already running CUDA 10, this solution doesn’t solve my problem. Thanks! /jd

https://devtalk.nvidia.com/default/topic/1042638/is-nvidia-driver-410-57-incompatible-with-cuda-9-0/?offset=6#5289184

Should I install CUDA 10.2 and the corresponding CUDNN 7.6.5 (the download page implies that there is a 10.1 version and a 10.2 version of 7.6.5) ? If I do this, I think I will need to update my display driver to 440, from my current 435.21. 435.21 is what my Ubuntu distro recommends, through the activities->software_update->additional_drivers menu (which does not offer 440). Installing a new display driver will make me cry, as I know this will cause 10 new things to go wrong and might not even solve my problem at hand. Thanks! /jd

OK, poured myself a stiff drink and tried updating my display driver to 440. nvidia-smi says I am now running 440.44, and CUDA version 10.2. One thing that amazes me is that my screen didn’t go black and I didn’t have to re-install Ubuntu, for the first time ever when updating display driver. One thing that surprises me is that the CUDA version spontaneously updated from 10.1 to 10.2, even though I didn’t do a CUDA update. Is this expected?
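
On the 10.1 to 10.2 question: as far as I understand, the "CUDA Version" field in the nvidia-smi header reports the highest CUDA version the installed driver supports, not the toolkit that is installed, so it can change with a driver update alone. The installed toolkit version is what `nvcc --version` reports. A minimal Python sketch for comparing the two follows; the function names are made up for illustration, and the nvcc line is an illustrative sample rather than output from this thread.

```python
import re


def driver_cuda_version(smi_header):
    """Return the 'CUDA Version' field of an nvidia-smi header as (major, minor)."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_header)
    if match is None:
        raise ValueError("no 'CUDA Version' field found")
    return int(match.group(1)), int(match.group(2))


def toolkit_cuda_version(nvcc_output):
    """Return the release number from `nvcc --version` output as (major, minor)."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("no 'release' field found")
    return int(match.group(1)), int(match.group(2))


def driver_covers_toolkit(smi_header, nvcc_output):
    """True if the driver-reported CUDA version is at least the toolkit's."""
    return driver_cuda_version(smi_header) >= toolkit_cuda_version(nvcc_output)


# Header line as posted earlier in this thread.
smi = "NVIDIA-SMI 435.21 Driver Version: 435.21 CUDA Version: 10.1"
# Illustrative nvcc line for a 10.2 toolkit (an assumption, not from the thread).
nvcc = "Cuda compilation tools, release 10.2, V10.2.89"
print(driver_cuda_version(smi), toolkit_cuda_version(nvcc),
      driver_covers_toolkit(smi, nvcc))
```

If the last value printed is False, the toolkit is newer than what the driver advertises, which is one reason a driver update might be needed before moving to a newer toolkit.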

Anyway, my problem changed but did not go away. Now I get gemv.h:81 rather than gemv.h:77 before the abort. Am I making progress? Is this what computer science is like? Sigh… I think it’s time for me to go for a walk in the woods and consider taking up knitting as a hobby rather than massively parallel computer programming.

Just for grins, you might say, I installed the CUDA-10.2 version of CUDNN 7.6.5 now that I seem to have installed CUDA 10.2. mnistCUDNN still aborts after the Algo 7 status line, but now the gemv.h line number in the error has switched back to 77 from 81. So I’m back where I started, although I think I have the latest versions of everything:

CUDNN 7.6.5 CUDA 10.2 version
Display Driver 440.44
CUDA 10.2

Hi,

Could you please share info regarding the GPU model used in this case and if possible API logs as well?
Setting API logging:
https://docs.nvidia.com/deeplearning/sdk/cudnn-developer-guide/index.html#api-logging

Thanks

I have the Nvidia branded GEFORCE RTX 2080 Super, it lights up green. ‘Inspired by Gamers, Built by Nvidia’. How do I send you the log file? I don’t see a method to attach files to posts in this forum. It’s 871 lines long. I think the function checkCublasError called on line 71 of gemv.h is throwing the error. When I comment it out, the program runs farther but reports as below. I don’t see checkCublasError in the api log whether or not it is commented out. thnx, /jd

Resulting weights from Softmax:
0.0992543 0.0980544 0.1002826 0.0994706 0.1003590 0.1006492 0.0989232 0.0995440 0.1034657 0.0999969
Loading image data/three_28x28.pgm
Performing forward propagation …
Resulting weights from Softmax:
0.0992543 0.0980544 0.1002826 0.0994706 0.1003590 0.1006492 0.0989232 0.0995440 0.1034657 0.0999969
Loading image data/five_28x28.pgm
Performing forward propagation …
Resulting weights from Softmax:
0.0992543 0.0980544 0.1002826 0.0994706 0.1003590 0.1006492 0.0989232 0.0995440 0.1034657 0.0999969

Result of classification: 8 8 8

Test failed!
Prediction mismatch
mnistCUDNN.cpp:876
Aborting…

I got smart enough for a moment to notice the paperclip at the top of this topic. Please find attached file api.log in my first post, that does seem to open as text.

Hi,

We tried with the same versions of OS/Driver/CUDA Toolkit, but we’re unable to reproduce this issue.

It seems to be a setup or misconfiguration issue. I recommend trying again with a clean installation, e.g.:
(1): Re-install the CUDA Toolkit (e.g. CUDA 10.2) and the cuDNN 7.6.5 GA build for CUDA 10.2.
(2): Re-build the “mnistCUDNN” sample (e.g. $make clean ; $make, or $make SMS=75 in case the Makefile doesn’t have “sm_75” architecture support for some reason).

If this issue persists, please provide the following output information:

$ldd mnistCUDNN  
$dpkg -l | grep  -i cudnn
$dpkg -l | grep  -i cuda

Thanks

Excelsior! Thanks for the quick response, Sunil. Now I can make my DNNs much more convoluted.

I re-installed everything yet again. I think this time I saved the deb files and sudo dpkg’ed them rather than executing them from the cudnn Download page. I don’t know if that made any difference.

sudo sh cuda_10.2.89_440.33.01_linux.run
/usr/local/cuda -> /usr/local/cuda-10.2/
checked that Mandelbrot works
sudo dpkg -i libcudnn7_7.6.5.32-1+cuda10.2_amd64.deb
sudo dpkg -i libcudnn7-dev_7.6.5.32-1+cuda10.2_amd64.deb
sudo dpkg -i libcudnn7-doc_7.6.5.32-1+cuda10.2_amd64.deb

A few cp’s that the interwebs suggested. The first seemed to be necessary as mnistCUDNN wouldn’t make beforehand. The second I’m not sure about.
sudo cp /usr/local/cuda-10.2/targets/x86_64-linux/include/cudnn.h /usr/local/cuda/include/
sudo cp /usr/local/cuda-10.2/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*

A couple of links suggested from a different topic on this forum:
sudo ln -s /usr/lib/x86_64-linux-gnu/libcublasLt.so.10.2.1.243 /usr/local/cuda/lib64/libcublasLt.so
sudo ln -s /usr/lib/x86_64-linux-gnu/libcublas.so.10.2.1.243 /usr/local/cuda/lib64/libcublas.so

A few additions to my .bashrc, which I’m not sure were necessary as the Makefile seems to handle this
export CUDA_HOME=/usr/local/cuda
export PATH=/usr/local/cuda/bin:$PATH
export CPATH=/usr/local/cuda/include:$CPATH
export LIBRARY_PATH=/usr/local/cuda/lib64:$LIBRARY_PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64:$LD_LIBRARY_PATH
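
With copies and symlinks of libcudnn and libcublas in several places, it can help to see which files each directory on LD_LIBRARY_PATH actually offers and what they resolve to. Here is a minimal Python sketch, assuming only the standard library; the helper name `find_library_copies` is hypothetical. Note that the dynamic loader also consults other sources (for example the ldconfig cache), so this shows only part of the picture.

```python
import glob
import os


def find_library_copies(pattern, search_path):
    """List (path, resolved_target) for files matching pattern in each directory."""
    hits = []
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing ':'
        for path in sorted(glob.glob(os.path.join(directory, pattern))):
            # realpath follows symlinks, so links and real files are told apart
            hits.append((path, os.path.realpath(path)))
    return hits


if __name__ == "__main__":
    search_path = os.environ.get("LD_LIBRARY_PATH", "")
    for pattern in ("libcudnn.so*", "libcublas.so*"):
        for path, target in find_library_copies(pattern, search_path):
            print(f"{path} -> {target}")
```

Directories are visited in the order they appear in the variable, so an unexpected copy in an earlier directory stands out in the listing.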

After working through this and a reboot, all the cudnn_10.2_samples_v7 samples compile and claim to pass.

Re-install, the universal solution to all software problems!

ldd mnistCUDNN
linux-vdso.so.1 (0x00007ffd98bb2000)
libcudart.so.10.2 => /usr/local/cuda-10.2/targets/x86_64-linux/lib/libcudart.so.10.2 (0x00007fab2babc000)
libcublas.so.9.1 => /usr/lib/x86_64-linux-gnu/libcublas.so.9.1 (0x00007fab28525000)
libcudnn.so.7 => /usr/local/cuda-10.0/targets/x86_64-linux/lib/libcudnn.so.7 (0x00007fab104d0000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fab10147000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fab0ff28000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fab0fb8a000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fab0f972000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fab0f581000)
/lib64/ld-linux-x86-64.so.2 (0x00007fab2c19e000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fab0f37d000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fab0f175000)
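
In the ldd output above, libcudart comes from a cuda-10.2 directory, libcublas resolves to a 9.1 library under /usr/lib/x86_64-linux-gnu, and libcudnn is loaded from a cuda-10.0 directory. A mix like this may be worth ruling out when a cuBLAS call fails. The Python sketch below flags such mismatches from pasted ldd text. The library list and function names are assumptions for illustration, and the soname check is only a heuristic, since cuBLAS version numbers do not always equal the toolkit version.

```python
import re

# Libraries to inspect (an illustrative list, not exhaustive).
CUDA_LIBS = {"libcudart", "libcublas", "libcublasLt", "libcudnn"}
LDD_LINE = re.compile(r"^\s*(lib[\w+-]+)\.so\.([\d.]+)\s+=>\s+(\S+)")


def cuda_libs(ldd_output):
    """Return (name, soname_version, path) for each CUDA library in ldd output."""
    libs = []
    for line in ldd_output.splitlines():
        match = LDD_LINE.match(line)
        if match and match.group(1) in CUDA_LIBS:
            libs.append(match.groups())
    return libs


def mismatches(libs, expected):
    """Describe libraries that do not look like they belong to CUDA `expected`."""
    problems = []
    for name, version, path in libs:
        # cuDNN's soname carries the cuDNN version, so only its path is checked.
        if name != "libcudnn" and not version.startswith(expected):
            problems.append(f"{name}: soname version {version}, expected {expected}")
        found = re.search(r"/cuda-(\d+\.\d+)/", path)
        if found and found.group(1) != expected:
            problems.append(f"{name}: loaded from cuda-{found.group(1)}, expected cuda-{expected}")
    return problems


# Three lines copied from the ldd output above.
sample = """\
libcudart.so.10.2 => /usr/local/cuda-10.2/targets/x86_64-linux/lib/libcudart.so.10.2 (0x00007fab2babc000)
libcublas.so.9.1 => /usr/lib/x86_64-linux-gnu/libcublas.so.9.1 (0x00007fab28525000)
libcudnn.so.7 => /usr/local/cuda-10.0/targets/x86_64-linux/lib/libcudnn.so.7 (0x00007fab104d0000)
"""
for problem in mismatches(cuda_libs(sample), "10.2"):
    print(problem)
```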

dpkg -l | grep -i cudnn
ii libcudnn7 7.5.1.10-1+cuda10.1 amd64 cuDNN runtime libraries
ii libcudnn7-dev 7.5.1.10-1+cuda10.1 amd64 cuDNN development libraries and headers
ii libcudnn7-doc 7.5.1.10-1+cuda10.1 amd64 cuDNN documents and samples

dpkg -l | grep -i cuda
ii cuda 10.2.89-1 amd64 CUDA meta-package
ii cuda-10-0 10.0.130-1 amd64 CUDA 10.0 meta-package
ii cuda-10-2 10.2.89-1 amd64 CUDA 10.2 meta-package
ii cuda-command-line-tools-10-0 10.0.130-1 amd64 CUDA command-line tools
ii cuda-command-line-tools-10-2 10.2.89-1 amd64 CUDA command-line tools
ii cuda-compiler-10-0 10.0.130-1 amd64 CUDA compiler
ii cuda-compiler-10-2 10.2.89-1 amd64 CUDA compiler
ii cuda-cublas-10-0 10.0.130-1 amd64 CUBLAS native runtime libraries
ii cuda-cublas-dev-10-0 10.0.130-1 amd64 CUBLAS native dev links, headers
ii cuda-cudart-10-0 10.0.130-1 amd64 CUDA Runtime native Libraries
rc cuda-cudart-10-1 10.1.168-1 amd64 CUDA Runtime native Libraries
ii cuda-cudart-10-2 10.2.89-1 amd64 CUDA Runtime native Libraries
ii cuda-cudart-dev-10-0 10.0.130-1 amd64 CUDA Runtime native dev links, headers
rc cuda-cudart-dev-10-1 10.1.168-1 amd64 CUDA Runtime native dev links, headers
ii cuda-cudart-dev-10-2 10.2.89-1 amd64 CUDA Runtime native dev links, headers
ii cuda-cufft-10-0 10.0.130-1 amd64 CUFFT native runtime libraries
rc cuda-cufft-10-1 10.1.168-1 amd64 CUFFT native runtime libraries
ii cuda-cufft-10-2 10.2.89-1 amd64 CUFFT native runtime libraries
ii cuda-cufft-dev-10-0 10.0.130-1 amd64 CUFFT native dev links, headers
ii cuda-cufft-dev-10-2 10.2.89-1 amd64 CUFFT native dev links, headers
ii cuda-cuobjdump-10-0 10.0.130-1 amd64 CUDA cuobjdump
ii cuda-cuobjdump-10-2 10.2.89-1 amd64 CUDA cuobjdump
ii cuda-cupti-10-0 10.0.130-1 amd64 CUDA profiling tools interface.
rc cuda-cupti-10-1 10.1.168-1 amd64 CUDA profiling tools interface.
ii cuda-cupti-10-2 10.2.89-1 amd64 CUDA profiling tools runtime libs.
ii cuda-cupti-dev-10-2 10.2.89-1 amd64 CUDA profiling tools interface.
ii cuda-curand-10-0 10.0.130-1 amd64 CURAND native runtime libraries
rc cuda-curand-10-1 10.1.168-1 amd64 CURAND native runtime libraries
ii cuda-curand-10-2 10.2.89-1 amd64 CURAND native runtime libraries
ii cuda-curand-dev-10-0 10.0.130-1 amd64 CURAND native dev links, headers
ii cuda-curand-dev-10-2 10.2.89-1 amd64 CURAND native dev links, headers
ii cuda-cusolver-10-0 10.0.130-1 amd64 CUDA solver native runtime libraries
rc cuda-cusolver-10-1 10.1.168-1 amd64 CUDA solver native runtime libraries
ii cuda-cusolver-10-2 10.2.89-1 amd64 CUDA solver native runtime libraries
ii cuda-cusolver-dev-10-0 10.0.130-1 amd64 CUDA solver native dev links, headers
ii cuda-cusolver-dev-10-2 10.2.89-1 amd64 CUDA solver native dev links, headers
ii cuda-cusparse-10-0 10.0.130-1 amd64 CUSPARSE native runtime libraries
rc cuda-cusparse-10-1 10.1.168-1 amd64 CUSPARSE native runtime libraries
ii cuda-cusparse-10-2 10.2.89-1 amd64 CUSPARSE native runtime libraries
ii cuda-cusparse-dev-10-0 10.0.130-1 amd64 CUSPARSE native dev links, headers
ii cuda-cusparse-dev-10-2 10.2.89-1 amd64 CUSPARSE native dev links, headers
ii cuda-demo-suite-10-0 10.0.130-1 amd64 Demo suite for CUDA
ii cuda-demo-suite-10-2 10.2.89-1 amd64 Demo suite for CUDA
ii cuda-documentation-10-0 10.0.130-1 amd64 CUDA documentation
ii cuda-documentation-10-2 10.2.89-1 amd64 CUDA documentation
ii cuda-driver-dev-10-0 10.0.130-1 amd64 CUDA Driver native dev stub library
ii cuda-driver-dev-10-2 10.2.89-1 amd64 CUDA Driver native dev stub library
ii cuda-drivers 440.33.01-1 amd64 CUDA Driver meta-package
ii cuda-gdb-10-0 10.0.130-1 amd64 CUDA-GDB
ii cuda-gdb-10-2 10.2.89-1 amd64 CUDA-GDB
ii cuda-gpu-library-advisor-10-0 10.0.130-1 amd64 CUDA GPU Library Advisor.
ii cuda-libraries-10-0 10.0.130-1 amd64 CUDA Libraries 10.0 meta-package
ii cuda-libraries-10-2 10.2.89-1 amd64 CUDA Libraries 10.2 meta-package
ii cuda-libraries-dev-10-0 10.0.130-1 amd64 CUDA Libraries 10.0 development meta-package
ii cuda-libraries-dev-10-2 10.2.89-1 amd64 CUDA Libraries 10.2 development meta-package
ii cuda-license-10-0 10.0.130-1 amd64 CUDA licenses
ii cuda-license-10-2 10.2.89-1 amd64 CUDA licenses
ii cuda-memcheck-10-0 10.0.130-1 amd64 CUDA-MEMCHECK
ii cuda-memcheck-10-2 10.2.89-1 amd64 CUDA-MEMCHECK
ii cuda-misc-headers-10-0 10.0.130-1 amd64 CUDA miscellaneous headers
ii cuda-misc-headers-10-2 10.2.89-1 amd64 CUDA miscellaneous headers
ii cuda-npp-10-0 10.0.130-1 amd64 NPP native runtime libraries
rc cuda-npp-10-1 10.1.168-1 amd64 NPP native runtime libraries
ii cuda-npp-10-2 10.2.89-1 amd64 NPP native runtime libraries
ii cuda-npp-dev-10-0 10.0.130-1 amd64 NPP native dev links, headers
ii cuda-npp-dev-10-2 10.2.89-1 amd64 NPP native dev links, headers
ii cuda-nsight-10-0 10.0.130-1 amd64 CUDA nsight
ii cuda-nsight-10-2 10.2.89-1 amd64 CUDA nsight
ii cuda-nsight-compute-10-0 10.0.130-1 amd64 NVIDIA Nsight Compute
rc cuda-nsight-compute-10-1 10.1.168-1 amd64 NVIDIA Nsight Compute
ii cuda-nsight-compute-10-2 10.2.89-1 amd64 NVIDIA Nsight Compute
rc cuda-nsight-systems-10-1 10.1.168-1 amd64 NVIDIA Nsight Systems
ii cuda-nsight-systems-10-2 10.2.89-1 amd64 NVIDIA Nsight Systems
ii cuda-nvcc-10-0 10.0.130-1 amd64 CUDA nvcc
rc cuda-nvcc-10-1 10.1.168-1 amd64 CUDA nvcc
ii cuda-nvcc-10-2 10.2.89-1 amd64 CUDA nvcc
ii cuda-nvdisasm-10-0 10.0.130-1 amd64 CUDA disassembler
ii cuda-nvdisasm-10-2 10.2.89-1 amd64 CUDA disassembler
ii cuda-nvgraph-10-0 10.0.130-1 amd64 NVGRAPH native runtime libraries
rc cuda-nvgraph-10-1 10.1.168-1 amd64 NVGRAPH native runtime libraries
ii cuda-nvgraph-10-2 10.2.89-1 amd64 NVGRAPH native runtime libraries
ii cuda-nvgraph-dev-10-0 10.0.130-1 amd64 NVGRAPH native dev links, headers
ii cuda-nvgraph-dev-10-2 10.2.89-1 amd64 NVGRAPH native dev links, headers
ii cuda-nvjpeg-10-0 10.0.130-1 amd64 NVJPEG native runtime libraries
rc cuda-nvjpeg-10-1 10.1.168-1 amd64 NVJPEG native runtime libraries
ii cuda-nvjpeg-10-2 10.2.89-1 amd64 NVJPEG native runtime libraries
ii cuda-nvjpeg-dev-10-0 10.0.130-1 amd64 NVJPEG native dev links, headers
ii cuda-nvjpeg-dev-10-2 10.2.89-1 amd64 NVJPEG native dev links, headers
ii cuda-nvml-dev-10-0 10.0.130-1 amd64 NVML native dev links, headers
ii cuda-nvml-dev-10-2 10.2.89-1 amd64 NVML native dev links, headers
ii cuda-nvprof-10-0 10.0.130-1 amd64 CUDA Profiler tools
rc cuda-nvprof-10-1 10.1.168-1 amd64 CUDA Profiler tools
ii cuda-nvprof-10-2 10.2.89-1 amd64 CUDA Profiler tools
ii cuda-nvprune-10-0 10.0.130-1 amd64 CUDA nvprune
ii cuda-nvprune-10-2 10.2.89-1 amd64 CUDA nvprune
ii cuda-nvrtc-10-0 10.0.130-1 amd64 NVRTC native runtime libraries
rc cuda-nvrtc-10-1 10.1.168-1 amd64 NVRTC native runtime libraries
ii cuda-nvrtc-10-2 10.2.89-1 amd64 NVRTC native runtime libraries
ii cuda-nvrtc-dev-10-0 10.0.130-1 amd64 NVRTC native dev links, headers
ii cuda-nvrtc-dev-10-2 10.2.89-1 amd64 NVRTC native dev links, headers
ii cuda-nvtx-10-0 10.0.130-1 amd64 NVIDIA Tools Extension
rc cuda-nvtx-10-1 10.1.168-1 amd64 NVIDIA Tools Extension
ii cuda-nvtx-10-2 10.2.89-1 amd64 NVIDIA Tools Extension
ii cuda-nvvp-10-0 10.0.130-1 amd64 CUDA nvvp
ii cuda-nvvp-10-2 10.2.89-1 amd64 CUDA nvvp
ii cuda-repo-ubuntu1804 10.0.130-1 amd64 cuda repository configuration files
ii cuda-runtime-10-0 10.0.130-1 amd64 CUDA Runtime 10.0 meta-package
ii cuda-runtime-10-2 10.2.89-1 amd64 CUDA Runtime 10.2 meta-package
ii cuda-samples-10-0 10.0.130-1 amd64 CUDA example applications
ii cuda-samples-10-2 10.2.89-1 amd64 CUDA example applications
rc cuda-sanitizer-api-10-1 10.1.168-1 amd64 CUDA Sanitizer API
ii cuda-sanitizer-api-10-2 10.2.89-1 amd64 CUDA Sanitizer API
ii cuda-toolkit-10-0 10.0.130-1 amd64 CUDA Toolkit 10.0 meta-package
rc cuda-toolkit-10-1 10.1.168-1 amd64 CUDA Toolkit 10.1 meta-package
ii cuda-toolkit-10-2 10.2.89-1 amd64 CUDA Toolkit 10.2 meta-package
ii cuda-tools-10-0 10.0.130-1 amd64 CUDA Tools meta-package
ii cuda-tools-10-2 10.2.89-1 amd64 CUDA Tools meta-package
ii cuda-visual-tools-10-0 10.0.130-1 amd64 CUDA visual tools
rc cuda-visual-tools-10-1 10.1.168-1 amd64 CUDA visual tools
ii cuda-visual-tools-10-2 10.2.89-1 amd64 CUDA visual tools
ii libcudart9.1:amd64 9.1.85-3ubuntu1 amd64 NVIDIA CUDA Runtime Library
ii libcudnn7 7.5.1.10-1+cuda10.1 amd64 cuDNN runtime libraries
ii libcudnn7-dev 7.5.1.10-1+cuda10.1 amd64 cuDNN development libraries and headers
ii libcudnn7-doc 7.5.1.10-1+cuda10.1 amd64 cuDNN documents and samples
ii libnvrtc9.1:amd64 9.1.85-3ubuntu1 amd64 CUDA Runtime Compilation (NVIDIA NVRTC Library)
ii nvidia-cuda-dev 9.1.85-3ubuntu1 amd64 NVIDIA CUDA development files
ii nvidia-cuda-doc 9.1.85-3ubuntu1 all NVIDIA CUDA and OpenCL documentation
ii nvidia-cuda-gdb 9.1.85-3ubuntu1 amd64 NVIDIA CUDA Debugger (GDB)
ii nvidia-cuda-toolkit 9.1.85-3ubuntu1 amd64 NVIDIA CUDA development toolkit
ii nvidia-profiler 9.1.85-3ubuntu1 amd64 NVIDIA Profiler for CUDA and OpenCL
ii nvidia-visual-profiler 9.1.85-3ubuntu1 amd64 NVIDIA Visual Profiler for CUDA and OpenCL
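
The listing above is long, but the toolkit meta-packages alone suggest several CUDA installs side by side: 10.0 and 10.2 installed (ii), 10.1 removed with configuration files remaining (rc), plus Ubuntu's own nvidia-cuda-toolkit 9.1. A minimal Python sketch for pulling just those lines out of `dpkg -l` output follows; the function name and the package-name pattern are assumptions for illustration.

```python
import re

# Matches cuda-toolkit, cuda-toolkit-10-2, nvidia-cuda-toolkit, and similar.
TOOLKIT_NAME = re.compile(r"(nvidia-)?cuda-toolkit(-\d+-\d+)?")


def toolkit_packages(dpkg_output):
    """Return {package_name: (status, version)} for CUDA toolkit packages."""
    result = {}
    for line in dpkg_output.splitlines():
        fields = line.split()
        # dpkg -l columns: status, name, version, architecture, description
        if len(fields) >= 3 and TOOLKIT_NAME.fullmatch(fields[1]):
            result[fields[1]] = (fields[0], fields[2])
    return result


# A few lines copied from the dpkg listing above.
sample = """\
ii cuda-nvcc-10-2 10.2.89-1 amd64 CUDA nvcc
ii cuda-toolkit-10-0 10.0.130-1 amd64 CUDA Toolkit 10.0 meta-package
rc cuda-toolkit-10-1 10.1.168-1 amd64 CUDA Toolkit 10.1 meta-package
ii cuda-toolkit-10-2 10.2.89-1 amd64 CUDA Toolkit 10.2 meta-package
ii nvidia-cuda-toolkit 9.1.85-3ubuntu1 amd64 NVIDIA CUDA development toolkit
"""
for name, (status, version) in toolkit_packages(sample).items():
    print(status, name, version)
```

Whether any of the older toolkits is the cause of a given failure is a separate question; the sketch only makes the mix easier to see.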