Failed to get convolution algorithm. This is probably because cuDNN failed to initialize

So I tried TF 1.8, but that won’t work with CUDA 10.0: on import it looks for the CUDA 9.0 libraries (see the libcublas.so.9.0 error below).
$ pip3 install --upgrade tensorflow-gpu==1.8.0

[…]

~/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py in
72 for some common reasons and solutions. Include the entire stack trace
73 above this error message when asking for help.""" % traceback.format_exc()
---> 74 raise ImportError(msg)
75
76 # pylint: enable=wildcard-import,g-import-not-at-top,unused-import,line-too-long

ImportError: Traceback (most recent call last):
File "/home/mkg/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in
from tensorflow.python.pywrap_tensorflow_internal import *
File "/home/mkg/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in
_pywrap_tensorflow_internal = swig_import_helper()
File "/home/mkg/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
File “/usr/lib/python3.6/imp.py”, line 243, in load_module
return load_dynamic(name, filename, file)
File “/usr/lib/python3.6/imp.py”, line 343, in load_dynamic
return _load(spec)
ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory
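
For anyone hitting the same ImportError: one quick way to see which CUDA libraries the dynamic loader can actually find is to try opening them by soname. This is only a rough sketch. The list of names is taken from the messages in this thread (libcublas.so.9.0 is what TF 1.8 asked for, libcublas.so.10.0 is what TF 1.13.1 opened), and `probe_libraries` is a made-up helper name, not part of TensorFlow.

```python
import ctypes

# Sonames taken from the messages in this thread; adjust for your own setup.
CANDIDATES = ["libcublas.so.9.0", "libcublas.so.10.0", "libcudnn.so.7"]


def probe_libraries(names):
    """Return {soname: True/False} depending on whether the loader can open it."""
    found = {}
    for name in names:
        try:
            ctypes.CDLL(name)  # raises OSError if the library cannot be opened
            found[name] = True
        except OSError:
            found[name] = False
    return found


if __name__ == "__main__":
    for name, ok in probe_libraries(CANDIDATES).items():
        print(f"{name}: {'found' if ok else 'NOT found'}")
```

If the library a given TensorFlow build wants shows up as NOT found, the import will most likely fail the same way as above.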

cuDNN:
/usr/lib/x86_64-linux-gnu/libcudnn_static.a
/usr/lib/x86_64-linux-gnu/libcudnn.so.7
/usr/lib/x86_64-linux-gnu/libcudnn.so
/usr/lib/x86_64-linux-gnu/libcudnn_static_v7.a
/usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.2

Nope. Did not fix the problem.

No it does not, actually.

tensorflow 1.13.1 with cudnn 7.4.1

~/.local/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
526 None, None,
527 compat.as_text(c_api.TF_Message(self.status.status)),
--> 528 c_api.TF_GetCode(self.status.status))
529 # Delete the underlying status object from memory otherwise it stays alive
530 # as there is a reference to status from this from the traceback due to

UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node conv1d_1/convolution/Conv2D}}]]

name: GeForce GTX 1660 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.59
pciBusID: 0000:01:00.0
totalMemory: 5.80GiB freeMemory: 5.23GiB
2019-09-05 18:13:23.621659: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-09-05 18:13:23.624609: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-09-05 18:13:23.624657: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0
2019-09-05 18:13:23.624676: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N
2019-09-05 18:13:23.624844: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5060 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1660 Ti, pci bus id: 0000:01:00.0, compute capability: 7.5)
2019-09-05 18:13:24.754500: I tensorflow/stream_executor/dso_loader.cc:152] successfully opened CUDA library libcublas.so.10.0 locally
2019-09-05 18:13:25.691315: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-09-05 18:13:25.710174: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR

After some more experimentation, a reboot followed by this sequence made the 1D convolution work:

import tensorflow as tf
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
tf.keras.backend.set_session(tf.Session(config=config))

The two things to highlight: this required a full reboot, and the snippet above was the very first thing executed in the session.

This did not work previously when I tried without a reboot. Even shutting down and restarting jupyter notebook did not help.
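
My guess (not confirmed anywhere in this thread) is that a leftover process was still holding GPU memory until the reboot. Before rebooting, it may be worth checking what is using the GPU. Below is a minimal sketch that asks nvidia-smi for the compute processes and their memory use. `parse_compute_apps` and `gpu_compute_apps` are names I made up, and the parsing assumes the plain "pid, used_memory" CSV layout.

```python
import subprocess


def parse_compute_apps(csv_text):
    """Parse 'pid, used_memory' CSV rows into a list of (pid, MiB) tuples."""
    apps = []
    for line in csv_text.strip().splitlines():
        parts = [p.strip() for p in line.split(",")]
        # Skip blank rows and rows that are not two plain integers.
        if len(parts) != 2 or not parts[0].isdigit() or not parts[1].isdigit():
            continue
        apps.append((int(parts[0]), int(parts[1])))
    return apps


def gpu_compute_apps():
    """Ask nvidia-smi which compute processes currently hold GPU memory."""
    result = subprocess.run(
        ["nvidia-smi", "--query-compute-apps=pid,used_memory",
         "--format=csv,noheader,nounits"],
        stdout=subprocess.PIPE, universal_newlines=True, check=True,
    )
    return parse_compute_apps(result.stdout)


if __name__ == "__main__":
    try:
        for pid, mib in gpu_compute_apps():
            print(f"pid {pid} is holding {mib} MiB of GPU memory")
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print(f"could not query nvidia-smi: {exc}")
```

If an old Python or Jupyter kernel shows up in that list, stopping it might be enough, but I have not verified that it replaces the reboot in this case.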

Here’s what I have installed for reference, with a GTX 1660 Ti on an ASUS ROG Strix laptop under Ubuntu 18.04.

$ sudo dpkg -i libcudnn7_7.4.1.5-1+cuda10.0_amd64.deb libcudnn7-dev_7.4.1.5-1+cuda10.0_amd64.deb libcudnn7-doc_7.4.1.5-1+cuda10.0_amd64.deb
$ pip3 install --upgrade tensorflow-gpu==1.13.1
$ nvidia-smi
Sat Sep 7 12:02:49 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.40       Driver Version: 430.40       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 166…    Off  | 00000000:01:00.0  On |                  N/A |
| N/A   52C    P0    33W /  N/A |   5011MiB /  5944MiB |     17%      Default |
+-------------------------------+----------------------+----------------------+

==============================================================================

[1]
import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
tf.keras.backend.set_session(tf.Session(config=config))

from keras.models import Sequential
from keras import layers
from keras.optimizers import RMSprop

Full execution example =>

https://devtalk.nvidia.com/default/topic/1048456/cudnn/-quot-failed-to-get-convolution-algorithm-quot-problem/post/5381714/#5381714


Yes, this worked for me! Thanks a lot.

Use this code right after importing the tensorflow library as tf:

import tensorflow as tf
config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.compat.v1.Session(config=config)
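
On TensorFlow 2.x the same idea can probably be expressed without the compat session, through the `tf.config.experimental` memory-growth setting. A minimal sketch, assuming a TensorFlow build that has `tf.config.experimental.set_memory_growth`. The wrapper function `enable_memory_growth` is just a name I picked, and it returns None if TensorFlow cannot be imported at all.

```python
def enable_memory_growth():
    """Turn on memory growth for every visible GPU.

    Returns the number of GPUs configured, or None if TensorFlow
    is not importable in this environment.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    gpus = tf.config.experimental.list_physical_devices("GPU")
    for gpu in gpus:
        # Has to run before anything initializes the GPU,
        # otherwise TensorFlow raises a RuntimeError.
        tf.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)


if __name__ == "__main__":
    print("GPUs configured:", enable_memory_growth())
```

As with the session-based version, this should be the first thing that runs after the import.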


Hi,
I also have an RTX 2070.
I was using TensorFlow-GPU a month ago and everything was working properly.
Then a GPU driver update came through and TensorFlow-GPU stopped working.
I reinstalled all my drivers, but TensorFlow still cannot recognize my GPU.
My CUDA was updated to 11.3, and I don’t think there is a compatible version of cuDNN yet.
Any suggestions on how to fix this?
How did you fix your issue?

Downgrade your CUDA to 11.2 if you are using TensorFlow 2.5; 11.3 is not supported yet. Use this table for reference: Build from source | TensorFlow
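
To make the version matching concrete, here is a small sketch that collects only the TensorFlow/CUDA pairings mentioned in this thread and checks an installed CUDA version against them. It is not an official or complete compatibility list, and `REPORTED_CUDA` and `check_cuda` are names I made up for illustration.

```python
# CUDA version each TensorFlow release was reported to need or work with
# in this thread. Not an official or complete list.
REPORTED_CUDA = {
    "1.8": "9.0",    # import failed looking for libcublas.so.9.0
    "1.13": "10.0",  # log shows libcublas.so.10.0 being opened
    "1.15": "10.0",  # reported working on an RTX 2070
    "2.5": "11.2",   # 11.3 reported as not supported yet
}


def check_cuda(tf_version, cuda_version):
    """Compare an installed CUDA version against the thread's reports.

    Returns True or False for a TensorFlow release listed above,
    and None for a release this thread says nothing about.
    """
    key = ".".join(tf_version.split(".")[:2])  # "1.13.1" -> "1.13"
    expected = REPORTED_CUDA.get(key)
    if expected is None:
        return None
    return cuda_version == expected


if __name__ == "__main__":
    print(check_cuda("2.5.0", "11.3"))
    print(check_cuda("1.13.1", "10.0"))
```

For anything not covered here, the table on the TensorFlow page linked above is the place to look.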


I confirm this worked for me, using TF 1.15, CUDA 10.0 and cuDNN 7.6.5 on an RTX 2070.
