How to get kernel failure instruction

FlaviusV · May 11, 2011, 8:56pm

Hi, I am currently developing a quicksort implementation in OpenCL. In order to do the sorting quickly on the GPU architecture, the kernel is rather complex and, surprisingly, I have a bug (or rather multiple bugs) there. It’s probably an out-of-bounds access to some array, but the only thing I get from the failing kernel is this, when I call clFinish after enqueuing the kernel:

CL_INVALID_COMMAND_QUEUE error executing clFinish on GeForce GTX 580 (Device 0).

Well, that’s not much information. I’d prefer at least the PTX instruction address where this error happens (I understand that requesting the line in the C code is rather too much). Is it somehow possible to obtain this on Linux with some version of the drivers? Or do you know any other method to obtain more info?

Usually, when the kernel just behaves wrongly (but does not fail), I can tunnel some info out through global memory, but after this failure the queue is damaged and I cannot do anything more with it.

FlaviusV · May 13, 2011, 9:07pm

For now, the best approach I have developed is to guard all array accesses with C macro-ish runtime checks in the “debug” version. A failing check skips the statement and pushes debug info (the __LINE__ macro works :-) ) to some global memory, which can be read back as long as the kernel does not fail (it just gives incorrect results). It’s a pity these checks must be written manually; it would be cool to hack the compiler to generate them automatically.
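
A minimal sketch of what such guarded accesses might look like, written as plain C so it can be compiled and tried on the host. This is not the poster's actual macro: the names GUARDED_LOAD, GUARDED_STORE, dbg_in_bounds, dbg_report and swap_elements, as well as the layout of the debug record, are hypothetical. In a real OpenCL kernel, dbg would presumably be a __global int * kernel argument, and the flag update would likely need an atomic (for example atomic_cmpxchg), since many work-items can fail at the same time.

```c
/* Hypothetical layout of the debug record kept in (global) memory. */
enum { DBG_FLAG, DBG_LINE, DBG_INDEX, DBG_LIMIT, DBG_SIZE };

/* Record only the first failure. Assumption: single-threaded host code.
 * In a kernel this check-and-set would have to be done atomically. */
void dbg_report(int *dbg, int line, long idx, long limit)
{
    if (dbg[DBG_FLAG] == 0) {
        dbg[DBG_FLAG]  = 1;
        dbg[DBG_LINE]  = line;
        dbg[DBG_INDEX] = (int)idx;
        dbg[DBG_LIMIT] = (int)limit;
    }
}

/* Returns 1 if idx is a valid index, otherwise reports and returns 0. */
int dbg_in_bounds(int *dbg, int line, long idx, long limit)
{
    if (idx >= 0 && idx < limit)
        return 1;
    dbg_report(dbg, line, idx, limit);
    return 0;
}

/* Guarded read: yields `fallback` instead of touching memory out of bounds.
 * Note that idx is evaluated twice, so it must not have side effects. */
#define GUARDED_LOAD(dbg, arr, idx, len, fallback)                     \
    (dbg_in_bounds((dbg), __LINE__, (long)(idx), (long)(len))          \
         ? (arr)[(idx)]                                                \
         : (fallback))

/* Guarded write: the store is skipped entirely when idx is out of bounds. */
#define GUARDED_STORE(dbg, arr, idx, len, val)                         \
    do {                                                               \
        if (dbg_in_bounds((dbg), __LINE__, (long)(idx), (long)(len)))  \
            (arr)[(idx)] = (val);                                      \
    } while (0)

/* Stand-in for a piece of kernel code: swaps data[i] and data[j]. */
void swap_elements(int *dbg, int *data, int n, int i, int j)
{
    int a = GUARDED_LOAD(dbg, data, i, n, 0);
    int b = GUARDED_LOAD(dbg, data, j, n, 0);
    GUARDED_STORE(dbg, data, i, n, b);
    GUARDED_STORE(dbg, data, j, n, a);
}
```

With a zero-initialised dbg array of DBG_SIZE ints, a call such as swap_elements(dbg, data, 4, 1, 7) on a 4-element array leaves dbg[DBG_FLAG] set and records the source line, the offending index 7 and the limit 4, which the host can inspect after the run. A release build could define the two macros as plain accesses so the checks cost nothing.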

TLM · November 5, 2013, 8:14pm

Awesome solution!
Can you please post the macro?

pasoleatis · November 5, 2013, 10:09pm

http://choorucode.com/2011/03/02/cuda-error-checking/