Failed to synchronize the stop event: CUDA_ERROR_LAUNCH_TIMEOUT: the launch timed out and was terminated

While training my model, I repeatedly get a CUDA error. The model trains fine with certain settings, but the error appears when I increase the kernel size of the second, third, or fourth layer of my neural network. Does anyone know what could cause this and how to solve it?

2021-04-20 16:54:52.765977: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to synchronize the stop event: CUDA_ERROR_LAUNCH_TIMEOUT: the launch timed out and was terminated
2021-04-20 16:54:52.766337: E tensorflow/stream_executor/gpu/gpu_timer.cc:55] Internal: Error destroying CUDA event: CUDA_ERROR_LAUNCH_TIMEOUT: the launch timed out and was terminated
2021-04-20 16:54:52.766626: E tensorflow/stream_executor/gpu/gpu_timer.cc:60] Internal: Error destroying CUDA event: CUDA_ERROR_LAUNCH_TIMEOUT: the launch timed out and was terminated
2021-04-20 16:54:52.767194: I tensorflow/stream_executor/cuda/cuda_driver.cc:789] failed to allocate 8B (8 bytes) from device: CUDA_ERROR_LAUNCH_TIMEOUT: the launch timed out and was terminated
2021-04-20 16:54:52.767484: E tensorflow/stream_executor/stream.cc:5011] Internal: Failed to enqueue async memset operation: CUDA_ERROR_LAUNCH_TIMEOUT: the launch timed out and was terminated
2021-04-20 16:54:52.768078: W tensorflow/core/kernels/gpu_utils.cc:69] Failed to check cudnn convolutions for out-of-bounds reads and writes with an error message: ‘Failed to load in-memory CUBIN: CUDA_ERROR_LAUNCH_TIMEOUT: the launch timed out and was terminated’; skipping this check. This only means that we won’t check cudnn for out-of-bounds reads and writes. This message will only be printed once.
2021-04-20 16:54:52.769020: I tensorflow/stream_executor/cuda/cuda_driver.cc:789] failed to allocate 8B (8 bytes) from device: CUDA_ERROR_LAUNCH_TIMEOUT: the launch timed out and was terminated
2021-04-20 16:54:52.769466: E tensorflow/stream_executor/stream.cc:5011] Internal: Failed to enqueue async memset operation: CUDA_ERROR_LAUNCH_TIMEOUT: the launch timed out and was terminated
2021-04-20 16:54:52.769936: F tensorflow/stream_executor/cuda/cuda_dnn.cc:189] Check failed: status == CUDNN_STATUS_SUCCESS (7 vs. 0)Failed to set cuDNN stream.

Hi @d.a.heesmans ,
Can you please share the details of the environment you are running this in?
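
The details that usually matter most for this kind of error (OS, Python version, TensorFlow version, and the GPUs TensorFlow can see) can be gathered with a short script. This is only a minimal sketch: the function name `collect_environment` is hypothetical, and it assumes TensorFlow 2.x is installed; if it is not, the TensorFlow fields are simply left empty.

```python
import platform
import sys


def collect_environment():
    """Return basic OS, Python, TensorFlow and GPU details as a dict.

    `collect_environment` is a hypothetical helper name used for illustration.
    """
    info = {
        "os": f"{platform.system()} {platform.release()}",
        "python": platform.python_version(),
        "tensorflow": None,  # stays None if TensorFlow cannot be imported
        "gpus": [],          # stays empty if no GPU is visible to TensorFlow
    }
    try:
        # Assumption: TensorFlow 2.x, where tf.config.list_physical_devices exists.
        import tensorflow as tf

        info["tensorflow"] = tf.__version__
        # These are device identifiers such as "/physical_device:GPU:0",
        # not the marketing name of the card.
        info["gpus"] = [d.name for d in tf.config.list_physical_devices("GPU")]
    except ImportError:
        pass
    return info


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(f"{key}: {value}")
```

The GPU model and driver version are not reported by this script; those can be added by hand.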

Hi,

The environment I am using:
Windows 10
Python 3.8.8
TensorFlow 2.4.1
GPU: NVIDIA Quadro M1200

Hi @AakankshaS,

Is the information above what you need to help with my issue?
If not, what additional information do you need?

Kind Regards,
Dennis

Hi @d.a.heesmans ,
This looks like an issue with TensorFlow.
Could you please try downgrading the TensorFlow version and running it again?
If the issue persists, we recommend raising it in the TensorFlow forum.

Thanks!