Kernel is not executed, but no error is returned

Hi everybody.
I’m developing a CUDA application with Visual Studio 2005 and I’m working on a GTX 560 Ti.

CUDA is really great, but I have a tricky problem. Sometimes my kernel doesn’t work (it is not executed), yet no error is reported. If I restart my PC, everything is OK for a while, and then the same behaviour comes back.

I’m compiling my code for GPU architecture sm_13 for backward compatibility. Could this be a problem?

Thanks,
Nicola

Hello,

For diagnosing the problem it might help to:

  1. Call cudaGetLastError() and check the returned error directly after the kernel launch, before any synchronization or other CUDA call; otherwise some launch errors might be missed.
  2. Try cuda-memcheck.
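
As a rough illustration of point 1, here is a minimal sketch of checking for errors right after the launch and again after synchronization. The helper checkCuda and the kernel scaleKernel are hypothetical names made up for this example, not taken from the original poster's code. It assumes a toolkit that provides cudaDeviceSynchronize() (older toolkits use cudaThreadSynchronize() instead).

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: prints a message and returns false if err is an error.
bool checkCuda(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "%s: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

// Hypothetical kernel, only here so that there is something to launch.
__global__ void scaleKernel(float *data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= factor;
}

int main()
{
    const int n = 1024, threads = 256;
    float *d_data = NULL;
    if (!checkCuda(cudaMalloc((void **)&d_data, n * sizeof(float)), "cudaMalloc"))
        return 1;
    if (!checkCuda(cudaMemset(d_data, 0, n * sizeof(float)), "cudaMemset"))
        return 1;

    scaleKernel<<<(n + threads - 1) / threads, threads>>>(d_data, n, 2.0f);

    // 1) Launch errors (for example an invalid launch configuration)
    //    are reported by cudaGetLastError() right after the launch.
    bool ok = checkCuda(cudaGetLastError(), "kernel launch");

    // 2) Errors raised while the kernel runs only show up at the next
    //    synchronizing call, so check that return value as well.
    ok = ok && checkCuda(cudaDeviceSynchronize(), "kernel execution");

    cudaFree(d_data);
    return ok ? 0 : 1;
}
```

For point 2, the same executable can then be run under the tool, for example as cuda-memcheck followed by the program name.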

Apart from that, we might need a few more details to be able to help you. Ideally, post a very small piece of example code that reproduces the error.
Nevertheless, I’ll try a blind guess. Am I understanding correctly that the kernel isn’t simply interrupted at some point, but really doesn’t start at all?
In that case I suspect your kernel has an out-of-bounds memory access. This can cause the GPU to act erratically until it is restarted, which sometimes includes refusing to launch kernels.
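
To make the guess concrete, here is a sketch of one common way an out-of-bounds access creeps in. This is only an assumption about the kind of bug, not a diagnosis of your code, and all names (fillUnguarded, fillGuarded, blocksFor) are invented for the example. When the grid size is rounded up, the last block contains surplus threads, and without a bounds check they write past the end of the buffer.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// BUGGY (not launched below): if n is not a multiple of the block size,
// the surplus threads of the last block write past the end of out.
__global__ void fillUnguarded(int *out, int n, int value)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    out[i] = value;              // out of bounds for i >= n
}

// Fixed: threads with i >= n do nothing.
__global__ void fillGuarded(int *out, int n, int value)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = value;
}

// Number of blocks needed to cover n elements (rounds up).
int blocksFor(int n, int threads)
{
    return (n + threads - 1) / threads;
}

int main()
{
    const int n = 1000, threads = 256;
    const int blocks = blocksFor(n, threads);  // 4 blocks = 1024 threads, 24 surplus

    int *d_out = NULL;
    if (cudaMalloc((void **)&d_out, n * sizeof(int)) != cudaSuccess)
        return 1;

    fillGuarded<<<blocks, threads>>>(d_out, n, 7);
    cudaError_t err = cudaGetLastError();      // launch errors
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();         // execution errors

    // Copy back the last valid element to confirm the kernel really ran.
    int last = 0;
    if (err == cudaSuccess)
        err = cudaMemcpy(&last, d_out + n - 1, sizeof(int), cudaMemcpyDeviceToHost);

    if (err != cudaSuccess)
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
    else
        printf("last element = %d\n", last);

    cudaFree(d_out);
    return (err == cudaSuccess && last == 7) ? 0 : 1;
}
```

Launching fillUnguarded in place of fillGuarded under cuda-memcheck is the sort of case where I would expect the tool to report invalid writes.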

I check cudaGetLastError() after every kernel invocation.

“This can cause the GPU to act erratically until restart.”

Is there no way to force a software restart for debugging purposes?

EDIT: Solved using cuda-memcheck