I get "CUDA Exception: Device Illegal Address". Could this be due to recursion within the kernel?
My kernel calls a recursive factorial function and a recursive combination-finding function. I also get the following warning when compiling:
ptxas warning : Stack size for entry function '_Z10compute_piPfS_iii' cannot be statically determined
Could this be the reason for “CUDA Exception: Device Illegal Address”? How can I identify the cause of this error more precisely?
CUDA_EXCEPTION_10, Device Illegal Address is caused by a memory access in your kernel that is targeting an address that does not correspond to a valid page (the address is bogus).
Typically, if recursion exhausts certain limits (such as the user data stack), you will encounter a different exception type (Lane User Stack Overflow or Warp Out-of-range Address).
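As a side note, the ptxas stack-size warning is typically emitted when a kernel uses recursion, because the compiler cannot bound the call depth at compile time. If you want to rule recursion out as a factor, the recursive helpers can usually be rewritten as loops. Here is a minimal sketch, assuming your functions compute n! and the binomial coefficient C(n, k). The names `factorial_iter` and `binomial_iter` are hypothetical and not taken from your code, and the macro guard is only there so the same source also builds with a plain host compiler.

```cpp
#include <cstdint>

// When not compiling with nvcc, make the CUDA qualifiers expand to nothing
// so this file also builds as ordinary C++.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Iterative n!. No recursion, so stack usage is known at compile time.
// Assumption: n <= 20, since 21! no longer fits in 64 bits.
__host__ __device__ inline std::uint64_t factorial_iter(unsigned n) {
    std::uint64_t result = 1;
    for (unsigned i = 2; i <= n; ++i) {
        result *= i;
    }
    return result;
}

// Iterative C(n, k) via the multiplicative formula.
// After step i, result holds C(n - k + i, i), so each division is exact.
__host__ __device__ inline std::uint64_t binomial_iter(unsigned n, unsigned k) {
    if (k > n) {
        return 0;
    }
    if (k > n - k) {
        k = n - k;  // use symmetry C(n, k) == C(n, n - k) to shorten the loop
    }
    std::uint64_t result = 1;
    for (unsigned i = 1; i <= k; ++i) {
        result = result * (n - k + i) / i;
    }
    return result;
}
```

Inside the kernel you would call `binomial_iter(n, k)` wherever the recursive version is called today. This removes the recursion, but it will not by itself fix an out-of-bounds access, so the debugging steps below still apply.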
To narrow down where the error occurs:
(1) While a device illegal address exception is imprecise, you may still be able to figure out where it’s coming from. Where does it say execution has stopped in your kernel due to the exception? Is it near a memory access in your code?
(2) Can you try running with the ‘set cuda memcheck on’ command in cuda-gdb? This will enable cuda-memcheck integrated mode, and should convert your exception into CUDA_EXCEPTION_1, Lane Illegal Address. This will stop your program precisely where the error occurred, and you should be able to print out program state to determine what the cause is.
Also, which GPU are you running on?