I am having trouble debugging a kernel. The program runs fine and gives correct results when it’s built in emulation mode, but when I run it on the GPU I get:
Cuda error: Kernel execution failed in file ‘aes.cu’ in line 133 : unspecified launch failure.
The line referred to is the CUT_CHECK_ERROR right after the kernel call. I have tried putting a breakpoint in the program before the kernel call to look at memory. My (CPU) debugger says that the addresses returned by the cudaMalloc calls are invalid, but I imagine that’s just because they are device memory addresses. Just the same, I tried inserting calls to CUT_CHECK_ERROR after the cudaMalloc and cudaMemcpy calls (which are already wrapped in CUDA_SAFE_CALL), to no avail.
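For reference, here is a minimal sketch of the checking pattern described above, written against the plain CUDA runtime API rather than the cutil macros. It is not the actual aes.cu code: the `check` helper, the `CHECK` macro, the `scale` kernel, and `run_checked` are all hypothetical names made up for illustration. Since kernel launches are asynchronous, a fault inside the kernel is normally only reported by a later synchronizing call, which is why the sketch checks both right after the launch and after a synchronize. It uses `cudaDeviceSynchronize`; older toolkits spelled this `cudaThreadSynchronize`.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not part of cutil): prints where a runtime call failed.
static bool check(cudaError_t err, const char *what, const char *file, int line) {
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed at %s:%d: %s\n",
                what, file, line, cudaGetErrorString(err));
        return false;
    }
    return true;
}
#define CHECK(call) check((call), #call, __FILE__, __LINE__)

// Placeholder kernel standing in for the real one. The bounds test keeps
// threads in the last block from writing past the end of the buffer.
__global__ void scale(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Allocates a device buffer, launches the kernel, and checks every step.
// Returns true only if all steps succeeded.
bool run_checked(int n) {
    float *d = NULL;
    if (!CHECK(cudaMalloc((void **)&d, n * sizeof(float)))) return false;

    bool ok = CHECK(cudaMemset(d, 0, n * sizeof(float)));
    if (ok) {
        int threads = 128;
        int blocks = (n + threads - 1) / threads;
        scale<<<blocks, threads>>>(d, n);
        // Reports errors detected at launch time (e.g. a bad configuration).
        ok = CHECK(cudaGetLastError());
        // Waits for the kernel to finish; faults raised while it ran show up here.
        ok = ok && CHECK(cudaDeviceSynchronize());
    }
    cudaFree(d);
    return ok;
}
```

Calling `run_checked(1000)` from `main` on a machine with a working GPU should return true; with a kernel that reads or writes out of bounds, the failure would be expected to show up at the `cudaDeviceSynchronize` check rather than at the launch itself.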
Does anyone know of a debugger I can use? Should I use more of those macros like CUDA_CHECK_ERROR and CUT_CHECK_ERROR (does anyone have any idea where they’re documented?)? Do you have other suggestions?
Thanks so much,