cudnn lstm is broken above driver 431.60, 'Unexpected Event status: 1 cuda'

CUDA: 10.1
cuDNN: 7.6.4
OS: Windows 10
GPU: RTX 2060

When the model gets complicated, for example with more than 3 LSTM layers, I randomly get ‘Unexpected Event status: 1 cuda’ on both TensorFlow (2.0) and PyTorch (1.3). The latest drivers I could find that don’t have the problem are 431.60 Game Ready and 431.86 Studio.

E tensorflow/stream_executor/cuda/] Error polling for event status: failed to query event: CUDA_ERROR_LAUNCH_FAILED: unspecified launch failure
2019-12-11 12:57:15.515019: E tensorflow/stream_executor/] CUDNN_STATUS_INTERNAL_ERROR
in tensorflow/stream_executor/cuda/ 'cudnnRNNForwardTraining( cudnn.handle(), rnn_desc.handle(), model_dims.max_seq_length, input_desc.handles(), input_data.opaque(), input_h_desc.handle(), input_h_data.opaque(), input_c_desc.handle(), input_c_data.opaque(), rnn_desc.params_handle(), params.opaque(), output_desc.handles(), output_data->opaque(), output_h_desc.handle(), output_h_data->opaque(), output_c_desc.handle(), output_c_data->opaque(), workspace.opaque(), workspace.size(), reserve_space.opaque(), reserve_space.size())'
2019-12-11 12:57:15.519102: F tensorflow/core/common_runtime/gpu/] Unexpected Event status: 1

edit: I thought cudnn 7.6.5 fixed the problem, but it didn’t.

I was getting the same error with 441.66. I downgraded to 431.86 and now have no issues.

Had same issues with TF 2.0 and 2.1

tensorflow 2.1 / 2.0
CUDA 10.1 / 10.0
cudnn 7.6.5 / 7.4.x
windows 10
gtx 1650

Any ideas on how to solve this problem with newer drivers? Is it really a driver issue?

“Downgrading(!) the NVIDIA driver to the last stable studio driver (431.86) solved the issue.”
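Since the reports above hinge on which driver is installed, a small check like the following can tell you whether your driver is newer than 431.86, the last version this thread reports as working. This is only a sketch: the function names and the `LAST_REPORTED_GOOD` constant are made up for illustration, and it assumes `nvidia-smi` is on the PATH (it returns None otherwise). A newer driver does not prove you will hit the error; it only matches the pattern described here.

```python
import re
import subprocess

# Last driver this thread reports as working (431.86 studio). Illustrative constant.
LAST_REPORTED_GOOD = (431, 86)


def parse_driver_version(text):
    """Extract (major, minor) from a version string such as '441.66'."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no driver version found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def is_newer_than_reported_good(version, reference=LAST_REPORTED_GOOD):
    """True if `version` is strictly newer than the reference driver."""
    return version > reference


def installed_driver_version():
    """Ask nvidia-smi for the driver version; return None if that fails."""
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        ).stdout
        return parse_driver_version(out)
    except (OSError, subprocess.CalledProcessError, ValueError):
        return None


if __name__ == "__main__":
    version = installed_driver_version()
    if version is None:
        print("Could not query the driver version with nvidia-smi.")
    elif is_newer_than_reported_good(version):
        print(f"Driver {version[0]}.{version[1]} is newer than 431.86.")
    else:
        print(f"Driver {version[0]}.{version[1]} is 431.86 or older.")
```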

Either your output is very long or your batch size isn’t large enough. Try a batch size of 32 and see if it arrives at this fault faster…

I’ll explain what’s actually happening later, but this is the quickest solution. You need a dedicated GPU for this, or else cudnnLSTM can’t work at its best.

Either your data leans towards sparse, or the recurrent sequence update is getting very big and the GPU fails to execute the malloc statement, so it fails when submitting the forward pass of your RNN. There is more to this that I will explain at a later time…

# TensorFlow 1.x API: let this process use up to 95% of GPU memory
import tensorflow as tf
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.95)
sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))

Let me know if this works for you. I ran into this a lot as well, so this is the basic conclusion I’ve come to.

Could you please let us know if you are still facing this issue?


Hi, I’m having the same issue I think.

I tried increasing the batch size on a toy example (though that wouldn’t be possible in a real scenario) and I also tried:

from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession
config = ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.95
session = InteractiveSession(config = config)

I used this version since your code doesn’t work on TensorFlow 2 anymore.
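For reference, TensorFlow 2 also has its own way to control GPU memory without a compat.v1 session. The sketch below turns on memory growth through `tf.config.experimental`, which is available in TF 2.0 and 2.1. Note that this is a different setting from `per_process_gpu_memory_fraction`: it allocates GPU memory on demand instead of capping it at a fraction. The function name is made up for illustration, and nothing in this thread confirms that it avoids the error; treat it as something to try, not a fix.

```python
def enable_gpu_memory_growth():
    """Enable on-demand GPU memory allocation for every visible GPU.

    Must be called before TensorFlow initializes the GPUs, i.e. before
    building any model or running any op. Returns the number of GPUs found.
    """
    # Imported here so that merely defining the function does not require TensorFlow.
    import tensorflow as tf

    gpus = tf.config.experimental.list_physical_devices("GPU")
    for gpu in gpus:
        # Grow the allocation as needed instead of reserving all memory up front.
        tf.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)
```

Call `enable_gpu_memory_growth()` once at the top of the script, before creating any layers.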

I’m also unable to downgrade my drivers to the 431.86 studio version. I get the error message “Your system requires a Standard driver package…”

What should I do?

Can you try installing the latest cuDNN version with the following system settings: “CUDA 10.2 and driver r440”?