CUDA debugging issues

I am puzzled by a couple of CUDA issues and wanted to know whether this is just my experience or others have run into similar problems, and whether there are any workarounds. I have noticed that when a CUDA function fails, control returns to the host almost immediately, within tens of microseconds. Is there any way to capture this error, or to find out more about what happened? For instance, I might be copying more elements into an array than I allocated with cudaMalloc, or there might be a resource issue such as trying to use more registers than are physically available. Has anyone else faced these issues?

The second issue is that the GPU's global memory sometimes seems to retain data from a previous computation: if the first run of the code executes all right, and I then run the program again while trying to use more elements than were allocated, control quickly passes back to the CPU, but the results still look correct because they are left over from the previous run. I am wondering whether I am the only one who has seen this. Thanks for the replies.

Use CUT_CHECK_ERROR from cutil or do:

kernel<<<grid,threads>>>();
cudaThreadSynchronize(); // needed because kernel launches are asynchronous
cudaError_t err = cudaGetLastError();
if (err != cudaSuccess)
    // handle the error (cudaGetErrorString(err) converts the error code to a human-readable string)

For performance reasons, you probably only want to check for errors in debug builds, not release builds. CUT_CHECK_ERROR already does this.
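A minimal sketch of such a debug-only check, wrapped in a macro similar in spirit to cutil's CUT_CHECK_ERROR (the macro name and message handling here are my own, not from cutil):

```cpp
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical debug-only check: compiled out when NDEBUG is defined
// (i.e. in release builds), active otherwise.
#ifdef NDEBUG
#define CHECK_KERNEL(msg) ((void)0)
#else
#define CHECK_KERNEL(msg)                                            \
    do {                                                             \
        cudaThreadSynchronize(); /* wait for the async kernel */     \
        cudaError_t e = cudaGetLastError();                          \
        if (e != cudaSuccess) {                                      \
            fprintf(stderr, "%s: %s\n", msg, cudaGetErrorString(e)); \
            exit(EXIT_FAILURE);                                      \
        }                                                            \
    } while (0)
#endif

// Usage:
// kernel<<<grid, threads>>>();
// CHECK_KERNEL("kernel launch");
```

The do/while(0) wrapper makes the macro behave like a single statement, so it is safe inside an if/else without braces.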

I am not sure this always works. In the example I gave earlier, I allocated some memory and then accessed far more elements in my kernel than I should have, yet for some reason I still got cudaSuccess. The most reliable strategy I have found so far is this: before copying my results array to GPU memory, I initialize a host array of the same size to 0 and copy the zeros to the GPU. That way, if the computation goes wrong, the array copied back to host memory after the kernel completes will contain zeros rather than stale but plausible-looking values from a previous run.
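The zero-initialization trick described above might look like this (variable names and the element count are my own, assuming a float results array):

```cpp
#include <cstring>
#include <cuda_runtime.h>

const int N = 1024;
float h_results[N];
float *d_results;

cudaMalloc((void**)&d_results, N * sizeof(float));

// Zero the host buffer and copy it over, so the device array starts
// from a known state instead of whatever a previous run left behind.
memset(h_results, 0, sizeof(h_results));
cudaMemcpy(d_results, h_results, sizeof(h_results), cudaMemcpyHostToDevice);

// kernel<<<grid, threads>>>(d_results);

cudaMemcpy(h_results, d_results, sizeof(h_results), cudaMemcpyDeviceToHost);
// Any element still 0 that the kernel should have written indicates the
// computation did not run or complete as expected.
```

Note that cudaMemset(d_results, 0, N * sizeof(float)) achieves the same zeroing directly on the device, without the extra host-to-device copy.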

If you write past the end of an array in memory, it is not always caught as an error right away. Often, CUT_CHECK_ERROR will only report an error during a later kernel call, long after the one that wrote past the end of the allocation.

Errors that cause the kernel launch to fail right away, such as requesting too many registers or too much shared memory, or using an unbound texture, will be caught by CUT_CHECK_ERROR immediately.
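For the delayed out-of-bounds case specifically, the cuda-memcheck tool that ships with the CUDA toolkit can catch the offending access at the point where it happens, reporting the kernel name and address (the binary name below is just an example):

```shell
# Run the application under the memory checker; execution is slower,
# but out-of-bounds global memory accesses are reported as they occur.
cuda-memcheck ./my_app
```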