CUDA driver 320.00 error

Hi All,

I’m getting a CUDA error, “unspecified launch failure”, that puts the driver into an error state. Does anyone know how to recover the CUDA driver from it, or whether I can enable some kind of verbose mode to get more information?

This is the nvidia-smi dump:
C:\Program Files\NVIDIA Corporation\NVSMI>nvidia-smi.exe
Wed Sep 04 16:14:23 2013
+------------------------------------------------------+
| NVIDIA-SMI 5.320.00   Driver Version: 320.00         |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  ERR!                TCC  | ERR!            ERR! |                 ERR! |
|ERR!  ERR!  ERR!  ERR! / ERR!  |        10MB / 6143MB |     ERR!        ERR! |
+-------------------------------+----------------------+----------------------+
|   1  Quadro FX 3800     WDDM  | 0000:28:00.0     N/A |                  N/A |
| 30%   78C   N/A    N/A / N/A  |        997MB / 998MB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|    0            ERROR: GPU is lost                                          |
|    1            Not Supported                                               |
+-----------------------------------------------------------------------------+

C:\Program Files\NVIDIA Corporation\NVSMI>nvidia-smi.exe -q

==============NVSMI LOG==============

Timestamp : Wed Sep 04 16:20:49 2013
Driver Version : 320.00

Attached GPUs : 2
Unable to determine the PCI bus id for the target device: GPU is lost

“unspecified launch failure” is the GPU equivalent of a “segfault” on CPUs; it usually means a kernel made an out-of-bounds memory access. I would suggest using cuda-memcheck to track down the out-of-bounds access. It may also be a follow-on error caused by a failing CUDA API call upstream (e.g. a failed memory allocation), so make sure to check the return status of all CUDA API calls.
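
To illustrate the status checking suggested above, here is a minimal sketch. The helper cudaOk and the kernel scale are hypothetical names made up for this example, not anything from the original application; the runtime calls themselves (cudaMalloc, cudaMemset, cudaGetLastError, cudaDeviceSynchronize, cudaGetErrorString, cudaFree) are standard CUDA runtime API. Your own code will differ, but the pattern of checking every call and synchronizing after the launch should carry over.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not from the original post): reports a failed CUDA
// call and returns false so the caller can stop before launching more work.
static bool cudaOk(cudaError_t err, const char* what) {
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

// Placeholder kernel; the bounds check keeps threads past the end of the
// array from touching memory they do not own.
__global__ void scale(float* data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1 << 20;
    float* d_data = nullptr;

    // A failed allocation leaves d_data unusable; catch it here rather
    // than letting a kernel dereference it later.
    if (!cudaOk(cudaMalloc(&d_data, n * sizeof(float)), "cudaMalloc")) return 1;
    if (!cudaOk(cudaMemset(d_data, 0, n * sizeof(float)), "cudaMemset")) return 1;

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    scale<<<blocks, threads>>>(d_data, n, 2.0f);

    // Errors in the launch itself (e.g. an invalid configuration) show up here.
    if (!cudaOk(cudaGetLastError(), "kernel launch")) return 1;
    // Errors during execution, such as "unspecified launch failure",
    // are reported once the host synchronizes with the device.
    if (!cudaOk(cudaDeviceSynchronize(), "kernel execution")) return 1;

    if (!cudaOk(cudaFree(d_data), "cudaFree")) return 1;
    return 0;
}
```

Assuming the file is saved as check.cu (an arbitrary name), it can be built with nvcc and then run under cuda-memcheck, which should report the offending kernel and address if an out-of-bounds access occurs.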