Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR

Hello,

I’m trying to compile for cuDNN, but I got the error below. Can someone help me with that?

2019-05-06 18:23:55.356327: I tensorflow/stream_executor/dso_loader.cc:152] successfully opened CUDA library libcublas.so.10.0 locally
2019-05-06 18:23:58.001212: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-05-06 18:23:58.038416: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
Traceback (most recent call last):
  File "test.py", line 41, in <module>
    model.fit(x, y, epochs=20, batch_size=n_batch)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/keras/engine/training.py", line 880, in fit
    validation_steps=validation_steps)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/keras/engine/training_arrays.py", line 329, in model_iteration
    batch_outs = f(ins_batch)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/keras/backend.py", line 3076, in __call__
    run_metadata=self.run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1439, in __call__
    run_metadata_ptr)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/errors_impl.py", line 528, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node conv2d/Conv2D}}]]
[[{{node ConstantFoldingCtrl/loss/dense_loss/broadcast_weights/assert_broadcastable/AssertGuard/Switch_0}}]]
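
The UnknownError itself says to look for a warning printed above the traceback. With longer logs that line is easy to miss, so here is a minimal sketch that pulls out only the W/E/F lines. The regular expression assumes the TF 1.x log layout seen in this post, and the names `LOG_LINE` and `find_problem_lines` are made up for illustration.

```python
import re

# Matches TensorFlow 1.x C++ log lines such as:
# 2019-05-06 18:23:58.001212: E tensorflow/.../cuda_dnn.cc:334] message
# (assumption: the layout is the one shown in this thread)
LOG_LINE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} [\d:.]+): "
    r"(?P<level>[IWEF]) (?P<source>\S+)\] (?P<message>.*)$"
)


def find_problem_lines(log_text, levels=("W", "E", "F")):
    """Return (level, source, message) for each log line at the given levels."""
    hits = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") in levels:
            hits.append(
                (match.group("level"), match.group("source"), match.group("message"))
            )
    return hits


# Two lines copied from the log in this post.
sample = """\
2019-05-06 18:23:55.356327: I tensorflow/stream_executor/dso_loader.cc:152] successfully opened CUDA library libcublas.so.10.0 locally
2019-05-06 18:23:58.001212: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
"""

for level, source, message in find_problem_lines(sample):
    print(level, source, message)
```

On the sample, only the cuda_dnn.cc line is reported, which is the real cause behind the "Failed to get convolution algorithm" message.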

I was also facing the same issue with an RTX 2060, TF 1.13.1, cuDNN 7.6, and CUDA 10.1.
The following resolved my issue:

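import tensorflow as tf

# Let TensorFlow grow its GPU memory use on demand instead of reserving it all up front.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)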

I tried the suggested ConfigProto, but for the longest time it did not work for me.

After some more experimentation, a reboot followed by this sequence made the 1D convolution work:

import tensorflow as tf
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
tf.keras.backend.set_session(tf.Session(config=config))

The key point is that this required a full reboot, and the sequence above had to be the first thing executed.

It did not work earlier when I tried it without a reboot. Even shutting down and restarting the Jupyter notebook did not help.
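
One possible explanation for needing a reboot (my guess, not something confirmed in this thread) is that a leftover process was still holding GPU memory. Before rebooting, it may be worth checking which compute processes the driver reports. The sketch below is untested on this exact setup. It uses nvidia-smi's `--query-compute-apps` option, and the helper names and sample row are hypothetical.

```python
import subprocess


def parse_compute_apps(csv_text):
    """Parse 'pid, process_name, used_memory' rows into dicts."""
    apps = []
    for row in csv_text.strip().splitlines():
        fields = [field.strip() for field in row.split(",")]
        if len(fields) != 3:
            continue  # skip blank or unexpected rows
        pid, name, used = fields
        apps.append({
            "pid": int(pid),
            "name": name,
            # used_memory may come back as "[N/A]" on some setups
            "used_mib": int(used) if used.isdigit() else None,
        })
    return apps


def list_gpu_processes():
    """Ask nvidia-smi which compute processes currently hold GPU memory."""
    output = subprocess.check_output(
        [
            "nvidia-smi",
            "--query-compute-apps=pid,process_name,used_memory",
            "--format=csv,noheader,nounits",
        ],
        universal_newlines=True,
    )
    return parse_compute_apps(output)


# Hypothetical sample row, for illustration only.
sample = "4242, python3, 4800\n"
print(parse_compute_apps(sample))
```

On a machine with the NVIDIA driver installed, `list_gpu_processes()` returns the live list. If an old Python or Jupyter kernel PID shows up there, killing it might be enough without a reboot.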

Here’s what I have installed for reference, with a GTX 1660 Ti on an ASUS ROG Strix laptop under Ubuntu 18.04.

$ sudo dpkg -i libcudnn7_7.4.1.5-1+cuda10.0_amd64.deb libcudnn7-dev_7.4.1.5-1+cuda10.0_amd64.deb libcudnn7-doc_7.4.1.5-1+cuda10.0_amd64.deb
$ pip3 install --upgrade tensorflow-gpu==1.13.1
$ nvidia-smi
Sat Sep 7 12:02:49 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.40       Driver Version: 430.40       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 166...  Off  | 00000000:01:00.0  On |                  N/A |
| N/A   52C    P0    33W /  N/A |   5011MiB /  5944MiB |     17%      Default |
+-------------------------------+----------------------+----------------------+
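
The Memory-Usage column is the part worth watching here. As far as I understand it, cuDNN can fail to create its handle when too little GPU memory is left, which would explain why allow_growth helps. A rough sketch for reading the same numbers from a script, using nvidia-smi's `--query-gpu` option (the helper names are made up for illustration):

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'memory.used, memory.total' rows (values in MiB) into tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def query_gpu_memory():
    """Return a list of (used_mib, total_mib), one entry per GPU."""
    output = subprocess.check_output(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        universal_newlines=True,
    )
    return parse_gpu_memory(output)


# Figures taken from the nvidia-smi output above: 5011MiB / 5944MiB.
for used, total in parse_gpu_memory("5011, 5944\n"):
    print("free MiB:", total - used)  # 933
```

Calling `query_gpu_memory()` before creating the session gives a quick sanity check that the card is not already nearly full.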

==============================================================================

[1]
import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
tf.keras.backend.set_session(tf.Session(config=config))

from keras.models import Sequential
from keras import layers
from keras.optimizers import RMSprop

Full execution example =>

https://devtalk.nvidia.com/default/topic/1048456/cudnn/-quot-failed-to-get-convolution-algorithm-quot-problem/post/5381714/#5381714

X-Ref:

https://devtalk.nvidia.com/default/topic/1062664/cudnn/problem-with-1d-convolutions-under-keras-/

https://devtalk.nvidia.com/default/topic/1062190/cudnn/cudnn-failed-to-initialize/

https://devtalk.nvidia.com/default/topic/1043867/cudnn/failed-to-get-convolution-algorithm-this-is-probably-because-cudnn-failed-to-initialize/

https://devtalk.nvidia.com/default/topic/1055928/cudnn/error-failed-to-get-convolution-algorithm-this-is-probably-because-cudnn-failed-to-initialize-so-/

@michael.gschwind Thank you for posting your solution. It fixed this issue for me on Windows 10 (while just passing the config through the tf.Session call did not work).
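
Since the order of calls seems to matter, another option that might be worth trying is the `TF_FORCE_GPU_ALLOW_GROWTH` environment variable, which newer TensorFlow releases read to turn on the same allow_growth behavior. I have not confirmed whether 1.13.1 honors it, so treat this as a sketch under that assumption rather than a verified fix.

```python
import os

# Assumption: the installed TensorFlow build honors TF_FORCE_GPU_ALLOW_GROWTH.
# setdefault keeps any value that was already exported in the shell.
os.environ.setdefault("TF_FORCE_GPU_ALLOW_GROWTH", "true")

# Import TensorFlow only after the variable is set, for example:
# import tensorflow as tf
```

The variable has to be in place before TensorFlow first touches the GPU, so setting it at the very top of the script or notebook is the safest spot.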