Detecting kernel launch failures

Greetings!

When I call CUDA utility functions, I can wrap them in CUDA_SAFE_CALL() to collect error messages with ease. But how can I detect a launch error for a user-defined kernel function? I mean functions launched like func<<<dimGrid,dimBlock>>>().

I would like to know whether a given kernel launched correctly or, better, what the reason for a failure was (for example, the kernel needing more registers than are available). Can someone please tell me how to check?

Thank you!

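Call cudaThreadSynchronize() after the kernel launch and check its return value, or check the return value of cudaGetLastError(). Kernel launches are asynchronous, so the synchronize is what makes errors from the kernel's execution visible.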

The CUT_CHECK_ERROR macro does this in debug builds.
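
The check described above could look roughly like the sketch below. It is only an illustration under some assumptions: the kernel func and the helper launchChecked are hypothetical names made up for this example, and the code uses cudaDeviceSynchronize(), the name newer toolkits use for what cudaThreadSynchronize() does. On an older toolkit, substitute cudaThreadSynchronize().

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, used only for illustration.
__global__ void func(int *out)
{
    out[blockIdx.x * blockDim.x + threadIdx.x] = 1;
}

// Hypothetical helper: launch func and return the first error seen.
cudaError_t launchChecked(int blocks, int threadsPerBlock)
{
    int *d_out = NULL;
    size_t bytes = sizeof(int) * (size_t)blocks * (size_t)threadsPerBlock;

    cudaError_t err = cudaMalloc((void **)&d_out, bytes);
    if (err != cudaSuccess)
        return err;

    func<<<blocks, threadsPerBlock>>>(d_out);

    // Errors in the launch itself (invalid configuration, too many
    // resources requested) are reported here, right after the launch.
    err = cudaGetLastError();

    // Errors raised while the kernel runs only appear after a sync.
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();

    cudaFree(d_out);
    return err;
}

int main()
{
    // A modest configuration, and one that asks for far more threads
    // per block than any device allows.
    cudaError_t ok  = launchChecked(4, 64);
    cudaError_t bad = launchChecked(1, 100000);

    printf("valid launch:   %s\n", cudaGetErrorString(ok));
    printf("invalid launch: %s\n", cudaGetErrorString(bad));
    return 0;
}
```

cudaGetErrorString() turns the returned code into a readable message, which is how the reason for a failed launch can be printed.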