I just did a fresh install of CUDA 11.0 (driver 450.51.06) together with cuDNN 8.0.5 on a machine running Ubuntu 18.04 LTS with a Titan RTX. Here is the head of the nvidia-smi output:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.51.06 Driver Version: 450.51.06 CUDA Version: 11.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 TITAN RTX On | 00000000:03:00.0 Off | N/A |
| 41% 30C P8 9W / 280W | 20MiB / 24219MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
Running the CUDA samples works fine, but the cuDNN samples crash with an illegal instruction:
Executing: mnistCUDNN
cudnnGetVersion() : 8005 , CUDNN_VERSION from cudnn.h : 8005 (8.0.5)
Host compiler version : GCC 7.5.0
There are 1 CUDA capable devices on your machine :
device 0 : sms 72 Capabilities 7.5, SmClock 1770.0 Mhz, MemSize (Mb) 24219, MemClock 7001.0 Mhz, Ecc=0, boardGroupID=0
Using device 0
Testing single precision
[...]
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 2057744 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.024992 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.026176 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.062848 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.064416 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.101920 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.120832 time requiring 184784 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
[...]
Test passed!
Testing half precision (math in single precision)
Loading binary file data/conv1.bin
Loading binary file data/conv1.bias.bin
Loading binary file data/conv2.bin
Loading binary file data/conv2.bias.bin
Loading binary file data/ip1.bin
Loading binary file data/ip1.bias.bin
Loading binary file data/ip2.bin
Loading binary file data/ip2.bias.bin
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 2057744 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.025696 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.053280 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.055040 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.062816 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.072736 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.102176 time requiring 184784 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
Illegal instruction (core dumped)
Apart from the illegal instruction error, I’m confused that the times for cudnnGetConvolutionForwardAlgorithm_v7 are negative and fixed at -1.0.
Does anyone have a clue what is going on, or know how to debug this? If you need additional information about the system or output from another program, please let me know.
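
In case it helps narrow things down: "Illegal instruction" is a signal raised by the host CPU, so one thing that can be checked is whether the CPU reports the instruction set extensions a prebuilt library might expect. Below is a minimal sketch of such a check. It assumes a Linux x86 machine where /proc/cpuinfo contains a "flags" line. The list of flags in FLAGS_OF_INTEREST is my own guess at what is worth looking at and is not taken from the cuDNN documentation, and I have not confirmed that a missing CPU extension is actually the cause here.

```python
from pathlib import Path

# Hypothetical list of x86 extensions worth checking; this is a guess,
# not something documented as a cuDNN requirement.
FLAGS_OF_INTEREST = ("sse4_1", "sse4_2", "avx", "avx2", "fma", "avx512f")


def parse_cpu_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    # No 'flags' line found (e.g. non-x86 or empty input).
    return set()


def report(flags, wanted=FLAGS_OF_INTEREST):
    """Map each flag of interest to True if the CPU reports it."""
    return {name: name in flags for name in wanted}


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    text = path.read_text() if path.exists() else ""
    for name, present in report(parse_cpu_flags(text)).items():
        print(f"{name:10s} {'yes' if present else 'MISSING'}")
```

A complementary check, assuming gdb is installed, would be to run the sample under gdb and, once it stops on SIGILL, print the faulting instruction with `x/i $pc` to see which instruction the CPU rejected.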