My CUDA program is aborting with “Runtime API error 4: unspecified launch failure.” and I am using cuda-memcheck to debug it.
At first this was puzzling: for 4 different parameter sets that crash, cuda-memcheck does not find any issues. Then I found another parameter set that also crashes and for which cuda-memcheck does report an error at a specific line.
When I check that line, it seems odd that the crash would occur there, because it only references variables that are on the stack and not dynamically allocated.
Further debugging shows me that cuda-memcheck always reports the same line, even when I insert new debugging statements earlier in the code. That just doesn’t make sense to me though; I can clearly see that the program is the newly compiled version because I can see my debug output.
What am I doing wrong? Is cuda-memcheck really so inaccurate?
P.S. I am using CUDA 4.0 on a GeForce GTX 460 under Linux.