CUDA error: launch failed

I use MATLAB and its Experiment Manager to train neural networks. Everything worked fine until an error occurred in one session. Since then, the GPU has not been usable for neural network training. See details below.

Environment

Matlab 2021a

GPU Type : RTX 3080
Nvidia Driver Version : 471.86
CUDA Version : 11.4
CUDNN Version : 11.4
Operating System + Version : Windows 10 home 20H2

  1. I ran Experiment Manager in MATLAB without a problem for days. Then, in one session, an "Out of Memory" error occurred.
  2. When I reran the same experiment, the “Cuda_error_launch_failed” error was shown and training could not continue.
  3. All previously working MATLAB deep learning programs stopped working with the same “Cuda_error_launch_failed” error.
  4. I deleted all files from the OS drive and reinstalled the OS, MATLAB, the RTX 3080 driver, the CUDA toolkit, and cuDNN. The “Cuda_error_launch_failed” error still occurs. I repeated the installation multiple times, but the problem persists.

Please suggest some tools that we can use to test and verify that there are no GPU hardware issues.

Has a cold boot (extended power off for tens of seconds) been attempted?

Good morning Christian,

Yes. Powered off overnight.

Just tried again. I unplugged the machine and held the power button for over 20 seconds to discharge. But I still have the same cuda_error_launch_failed after reboot.

Is there any tool that I can use to test for hardware failure?
Thanks,
Frank

GPU-Z shows GPU capabilities and may run a small selection of benchmarks.

Then there are also tools like CompuBench for Windows that can measure GPU performance (OpenCL and CUDA):

https://compubench.com/windows-download-2.0/
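
As a scriptable complement to the GUI tools above, here is a minimal Python sketch for reading basic GPU state. It assumes the `nvidia-smi` utility that ships with the NVIDIA driver is on the PATH; the function names `parse_gpu_line` and `query_gpus` are made up for illustration. It only checks that the driver can see and query the card, so it is not a stress test and cannot prove the hardware is healthy.

```python
import subprocess

# Fields requested from nvidia-smi, in this order.
FIELDS = ["name", "driver_version", "temperature.gpu", "memory.used", "memory.total"]


def parse_gpu_line(line):
    """Turn one CSV line of nvidia-smi output into a dict keyed by field name."""
    values = [v.strip() for v in line.split(",")]
    if len(values) != len(FIELDS):
        raise ValueError(f"unexpected nvidia-smi output: {line!r}")
    return dict(zip(FIELDS, values))


def query_gpus():
    """Run nvidia-smi and return one dict per GPU the driver can see."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(FIELDS),
        "--format=csv,noheader,nounits",
    ]
    # check=True raises CalledProcessError if nvidia-smi itself reports a failure.
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return [parse_gpu_line(l) for l in out.splitlines() if l.strip()]


if __name__ == "__main__":
    for gpu in query_gpus():
        print(gpu)
```

If the command fails or lists no GPU, that would suggest the problem sits below MATLAB, in the driver or the card itself. If it reports the RTX 3080 normally, a benchmark run is still needed to exercise actual kernel launches.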

Great! Thanks. Will try.