I’m calling my kernel with one block and a user-selectable number of threads. With any thread count from 1 to 192, everything works as it should, but with 193 I get back the error CUDA_ERROR_INVALID_IMAGE. I haven’t been able to find, in the documentation or by googling, what that error actually indicates. What’s the general interpretation of that code?
The kernel call and error checking are as follows:
Sanders<<<1,probSize>>>(fex, _Dneq, _Dy, _Dt, _Dtout, _Ditol, _Drtol, _Datol, _Ditask, _Distate, _Diopt, _Drwork, _Dlrw, _Diwork, _Dliw, jex, _Djt, _Dcommon, _Derr, probSize);
error = cudaGetLastError();
error2 = cudaThreadSynchronize();
where probSize is set by the user at run time. This is running on a Tesla cluster node, and it only copies a small amount of data, so I doubt that I’m running out of memory.
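For anyone trying to narrow down a failure like this, here is a minimal sketch of how the launch could be instrumented. It makes no claim about the actual cause. It assumes a hypothetical placeholder kernel, demoKernel, standing in for Sanders (whose real signature is not reproduced here), and a hypothetical helper, blockSizeFits. It prints the launch and synchronization errors as text with cudaGetErrorString, and it asks the runtime, through cudaFuncGetAttributes, how many threads per block this particular kernel can be launched with. It also uses cudaDeviceSynchronize, the newer replacement for the deprecated cudaThreadSynchronize.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the real kernel; name and signature are assumptions.
__global__ void demoKernel(double *y, int n)
{
    int i = threadIdx.x;
    if (i < n) {
        y[i] = 2.0 * i;
    }
}

// Hypothetical host-side helper: does the requested block size fit the limit
// the runtime reports for this kernel?
static bool blockSizeFits(int requested, int maxThreadsPerBlock)
{
    return requested > 0 && requested <= maxThreadsPerBlock;
}

int main()
{
    int probSize = 193;  // the thread count that fails in the post

    // Query per-kernel resource usage and the resulting block-size limit.
    cudaFuncAttributes attr;
    cudaError_t err = cudaFuncGetAttributes(&attr, demoKernel);
    if (err != cudaSuccess) {
        printf("cudaFuncGetAttributes: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("registers per thread: %d\n", attr.numRegs);
    printf("max threads per block for this kernel: %d\n", attr.maxThreadsPerBlock);

    if (!blockSizeFits(probSize, attr.maxThreadsPerBlock)) {
        printf("%d threads exceeds this kernel's per-block limit\n", probSize);
        return 1;
    }

    double *dY = nullptr;
    err = cudaMalloc(&dY, probSize * sizeof(double));
    if (err != cudaSuccess) {
        printf("cudaMalloc: %s\n", cudaGetErrorString(err));
        return 1;
    }

    demoKernel<<<1, probSize>>>(dY, probSize);

    // Launch-time errors (bad configuration) and run-time errors (reported at
    // synchronization) are checked separately, as in the original post.
    cudaError_t launchErr = cudaGetLastError();
    cudaError_t syncErr = cudaDeviceSynchronize();
    printf("launch: %s\n", cudaGetErrorString(launchErr));
    printf("sync:   %s\n", cudaGetErrorString(syncErr));

    cudaFree(dY);
    return (launchErr == cudaSuccess && syncErr == cudaSuccess) ? 0 : 1;
}
```

Running the same query against the real kernel shows whether the requested block size is above or below the limit the runtime reports for it, which at least rules a launch-configuration problem in or out.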