Problem with 1D convolutions under keras:

$ nvidia-smi
Thu Sep 5 11:20:56 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.40 Driver Version: 430.40 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 166… Off | 00000000:01:00.0 On | N/A |
| N/A 42C P8 5W / N/A | 4375MiB / 5944MiB | 1% Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1992 G /usr/lib/xorg/Xorg 219MiB |
| 0 2219 G /usr/bin/gnome-shell 96MiB |
| 0 6431 G …quest-channel-token=8160790192539462645 46MiB |
| 0 6877 C /usr/bin/python3 4009MiB |
+-----------------------------------------------------------------------------+

This has been a Conv1D issue for a while.
I got one of the examples from François Chollet’s book (Listing 6.46) to work after rebooting my system. Then, voilà, the next example fails (Listing 6.46).

This is with a GeForce GTX 1660 Ti in a laptop running Ubuntu 18.04, CUDA 10.0 with cuDNN 7, Python 3.6 and tensorflow-gpu.

===================================== stacked 1D Conv network =============================
from keras.models import Sequential
from keras import layers
from keras.optimizers import RMSprop

# float_data, train_gen, val_gen and val_steps are defined earlier in the notebook.
model = Sequential()
model.add(layers.Conv1D(32, 5, activation='relu',
                        input_shape=(None, float_data.shape[-1])))
model.add(layers.MaxPooling1D(3))
model.add(layers.Conv1D(32, 5, activation='relu'))
model.add(layers.MaxPooling1D(3))
model.add(layers.Conv1D(32, 5, activation='relu'))
model.add(layers.GlobalMaxPooling1D())
model.add(layers.Dense(1))

model.compile(optimizer=RMSprop(), loss='mae')

history = model.fit_generator(train_gen,
                              steps_per_epoch=500,
                              epochs=20,
                              validation_data=val_gen,
                              validation_steps=val_steps)

~/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py in __call__(self, *args, **kwargs)
1456 ret = tf_session.TF_SessionRunCallable(self._session._session,
1457 self._handle, args,
-> 1458 run_metadata_ptr)
1459 if run_metadata:
1460 proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

UnknownError: 2 root error(s) found.
(0) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node conv1d_1/convolution}}]]
[[loss/mul/_71]]
(1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node conv1d_1/convolution}}]]
0 successful operations.
0 derived errors ignored.

==========================================================================================

2019-09-05 12:02:43.301213: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-09-05 12:02:43.305059: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2019-09-05 12:02:43.409874: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-09-05 12:02:43.410234: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55b0150 executing computations on platform CUDA. Devices:
2019-09-05 12:02:43.410263: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): GeForce GTX 1660 Ti, Compute Capability 7.5
2019-09-05 12:02:43.430879: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2400000000 Hz
2019-09-05 12:02:43.431295: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4160850 executing computations on platform Host. Devices:
2019-09-05 12:02:43.431312: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): <undefined>, <undefined>

2019-09-05 12:02:43.439496: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5185 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1660 Ti, pci bus id: 0000:01:00.0, compute capability: 7.5)
2019-09-05 12:02:44.021009: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-09-05 12:02:44.223999: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2019-09-05 12:02:44.606974: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-09-05 12:02:44.614763: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
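
The error message says to look for a warning printed above, and in a long log the two E lines are easy to miss among the I lines. Below is a minimal sketch for pulling out only the warning and error lines, assuming the log has been saved as text. The names LOG_LINE and find_problems are made up for this example and are not part of TensorFlow; the line format is inferred from the log lines in this post.

```python
import re

# Matches lines such as:
# 2019-09-05 12:02:44.606974: E tensorflow/.../cuda_dnn.cc:329] Could not create ...
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): "
    r"(?P<severity>[IWEF]) "
    r"(?P<source>\S+:\d+)\] "
    r"(?P<message>.*)$"
)


def find_problems(log_text, severities=("W", "E", "F")):
    """Return (severity, source, message) for each log line at the given severities."""
    problems = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match and match.group("severity") in severities:
            problems.append(
                (match.group("severity"), match.group("source"), match.group("message"))
            )
    return problems


# Two lines copied from the log in this post: one informational, one error.
sample = """\
2019-09-05 12:02:44.223999: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2019-09-05 12:02:44.606974: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
"""

for severity, source, message in find_problems(sample):
    print(severity, source, message)
```

Run against the full log, this leaves only the two "Could not create cudnn handle" lines, which is the real failure behind the "Failed to get convolution algorithm" message.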

To clarify: Reboot does nothing for this example…

mkg@vicky:~$ nvidia-smi
Thu Sep 5 12:19:28 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.40 Driver Version: 430.40 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 166… Off | 00000000:01:00.0 On | N/A |
| N/A 44C P5 10W / N/A | 5938MiB / 5944MiB | 23% Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1546 G /usr/lib/xorg/Xorg 159MiB |
| 0 1720 G /usr/bin/gnome-shell 99MiB |
| 0 2117 G …quest-channel-token=7415281734775576349 44MiB |
| 0 2416 C /usr/bin/python3 5631MiB |
+-----------------------------------------------------------------------------+
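
The output above shows 5938MiB of 5944MiB in use, with 5631MiB held by the python3 process. One possible reading, not confirmed in this thread, is that nothing is left on the card when cuDNN tries to create its handle. Below is a minimal sketch for checking used and total memory from Python before starting a run. It assumes nvidia-smi is on the PATH; parse_memory_csv and query_gpu_memory are names made up for this example.

```python
import subprocess


def parse_memory_csv(csv_text):
    """Parse 'used, total' pairs in MiB, one line per GPU, into a list of tuples."""
    gpus = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        gpus.append((used, total))
    return gpus


def query_gpu_memory():
    """Ask nvidia-smi for used and total memory (MiB) of every GPU."""
    output = subprocess.check_output(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        universal_newlines=True,
    )
    return parse_memory_csv(output)


# Figures taken from the nvidia-smi output in this post (5938MiB / 5944MiB),
# so the example runs even on a machine without a GPU.
for used, total in parse_memory_csv("5938, 5944\n"):
    print("free: %d MiB of %d MiB" % (total - used, total))
```

On the machine itself, query_gpu_memory() returns the live figures in the same (used, total) form.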

~/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py in __call__(self, *args, **kwargs)
1456 ret = tf_session.TF_SessionRunCallable(self._session._session,
1457 self._handle, args,
-> 1458 run_metadata_ptr)
1459 if run_metadata:
1460 proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

UnknownError: 2 root error(s) found.
(0) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node conv1d_1/convolution}}]]
[[loss/mul/_71]]
(1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node conv1d_1/convolution}}]]
0 successful operations.
0 derived errors ignored.

2019-09-05 12:17:36.022981: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5263 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1660 Ti, pci bus id: 0000:01:00.0, compute capability: 7.5)
2019-09-05 12:17:37.372288: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-09-05 12:17:37.712368: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2019-09-05 12:17:38.317711: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-09-05 12:17:38.321064: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR

I tried the suggested ConfigProto setting, but for the longest time it did not work for me.

So after some more experimentation, a reboot and the following sequence made the 1D convolution work.

import tensorflow as tf
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
tf.keras.backend.set_session(tf.Session(config=config))

The thing to highlight is that this required a full reboot, and that this sequence was the first thing executed afterwards.

This did not work previously when I tried without a reboot. Even shutting down and restarting jupyter notebook did not help.
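
Since the order of execution seems to matter, another option is to request memory growth through an environment variable set before TensorFlow is imported. Newer TensorFlow releases document TF_FORCE_GPU_ALLOW_GROWTH for this. Whether the version installed here honors it was not tested in this thread, so treat this as a sketch under that assumption. The helper name request_memory_growth is made up for this example.

```python
import os
import sys


def request_memory_growth(environ=None, loaded_modules=None):
    """Set TF_FORCE_GPU_ALLOW_GROWTH=true in the environment.

    Returns False when TensorFlow is already imported, in which case the GPU
    may have been initialized before the variable was set.
    """
    environ = os.environ if environ is None else environ
    loaded_modules = sys.modules if loaded_modules is None else loaded_modules
    already_imported = "tensorflow" in loaded_modules
    environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"
    return not already_imported


# Run this in the very first cell, before any TensorFlow or Keras import.
if not request_memory_growth():
    print("TensorFlow was imported first; restart the kernel and run this cell first.")
```

The check on the imported modules is deliberately conservative: it only warns that the setting may have come too late, it does not inspect the GPU state.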

Here’s what I have installed for reference, with a GTX 1660 Ti on an ASUS ROG Strix laptop under Ubuntu 18.04.

sudo dpkg -i libcudnn7_7.4.1.5-1+cuda10.0_amd64.deb libcudnn7-dev_7.4.1.5-1+cuda10.0_amd64.deb libcudnn7-doc_7.4.1.5-1+cuda10.0_amd64.deb
pip3 install --upgrade tensorflow-gpu==1.13.1
$ nvidia-smi
Sat Sep 7 12:02:49 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.40 Driver Version: 430.40 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 166… Off | 00000000:01:00.0 On | N/A |
| N/A 52C P0 33W / N/A | 5011MiB / 5944MiB | 17% Default |
+-------------------------------+----------------------+----------------------+

==============================================================================

[1]
import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
tf.keras.backend.set_session(tf.Session(config=config))

from keras.models import Sequential
from keras import layers
from keras.optimizers import RMSprop

Full execution example =>

https://devtalk.nvidia.com/default/topic/1048456/cudnn/-quot-failed-to-get-convolution-algorithm-quot-problem/post/5381714/#5381714