I am using TotalView to debug some software. I assume it is using cuda-memcheck-style functionality to check for memory errors.
The debugger has stopped in one of my kernels with a “Lane User Stack Overflow”.
What does this imply? The line that it stops on is:
d_usKernel[i*cols + j] = cuCmulf(d_usKernel[i*cols + j], d_rr[i*cols + j]);
with i and j being 863 and 57 respectively. These arrays are 1025 x 1024 (rows x cols).
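To rule out a plain out-of-bounds access at that point, here is a minimal host-side sketch of the index arithmetic. It assumes a row-major layout with rows = 1025 and cols = 1024, as described above; the helper names flat_index and in_bounds are hypothetical and are not part of my actual code.

```c
#include <stddef.h>

/* Hypothetical helper: flat offset of element (i, j) in a row-major array
   with `cols` columns, mirroring the i*cols + j expression in the kernel. */
static size_t flat_index(size_t i, size_t j, size_t cols)
{
    return i * cols + j;
}

/* Hypothetical helper: returns 1 if (i, j) lies inside a rows x cols array,
   0 otherwise. */
static int in_bounds(size_t i, size_t j, size_t rows, size_t cols)
{
    return i < rows && j < cols;
}
```

With i = 863, j = 57 and cols = 1024, the offset is 863*1024 + 57 = 883769, which is below the 1025*1024 = 1049600 elements in each array, so the index itself appears to be in range.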
So what should I be looking into to fix this problem?