Return error codes from previous, asynchronous launches

What does this mean? Were some errors generated in other functions and not reported, or is the error "stack" not cleared after a previous error?
In my program, I know that the function reporting the error is not the cause of the error. How can I find where the error is generated if it is not reported there?
cuda-gdb and gdb are of limited help here because the errors happen only after a few hundred thousand iterations, and running that far under a debugger would take much too long.

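It means that an asynchronous call (a kernel launch, for example) can fail without returning the error to the caller, because the call has already returned by the time the failure occurs. In that case the error is "held" by the context until the next status-returning call is made, and that call reports it even though it did nothing wrong itself. Personally, I favor using explicit calls to cudaGetLastError() after asynchronous operations to check for errors.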
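A minimal sketch of that approach, assuming a loop that launches one kernel per iteration as in the question. The names CUDA_CHECK, step_kernel, run_checked_iterations and the sync_every parameter are made up for illustration and are not part of the CUDA API; only the cuda* calls are. cudaGetLastError() right after the launch catches launch failures (bad configuration and the like), while errors that occur while the kernel runs only surface at a synchronizing call, so the sketch also calls cudaDeviceSynchronize() every sync_every iterations to narrow down the window in which the failure happened.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper macro (not part of CUDA): print where a runtime call
// failed and return 1 from the enclosing function. For brevity, device
// memory is not freed on the error path.
#define CUDA_CHECK(call)                                               \
    do {                                                               \
        cudaError_t err_ = (call);                                     \
        if (err_ != cudaSuccess) {                                     \
            std::fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,    \
                         cudaGetErrorString(err_));                    \
            return 1;                                                  \
        }                                                              \
    } while (0)

// Placeholder kernel standing in for the real per-iteration work.
__global__ void step_kernel(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

// Runs `iterations` launches; returns 0 on success, 1 on the first error.
int run_checked_iterations(int iterations, int sync_every) {
    if (sync_every <= 0) sync_every = 1;

    const int n = 1024;
    float *d_data = nullptr;
    CUDA_CHECK(cudaMalloc(&d_data, n * sizeof(float)));
    CUDA_CHECK(cudaMemset(d_data, 0, n * sizeof(float)));

    for (int it = 0; it < iterations; ++it) {
        step_kernel<<<(n + 255) / 256, 256>>>(d_data, n);

        // Reports errors detected at launch time.
        CUDA_CHECK(cudaGetLastError());

        // Errors raised while the kernel executes show up only at a
        // synchronizing call; checking periodically bounds the search
        // to the last `sync_every` iterations.
        if ((it + 1) % sync_every == 0) {
            CUDA_CHECK(cudaDeviceSynchronize());
        }
    }

    CUDA_CHECK(cudaDeviceSynchronize());
    CUDA_CHECK(cudaFree(d_data));
    return 0;
}

int main() {
    return run_checked_iterations(1000, 100);
}
```

Synchronizing after every launch (sync_every of 1) pinpoints the failing iteration but removes the overlap between host and device work, so a larger interval is probably the better starting point for a run of a few hundred thousand iterations. Setting the environment variable CUDA_LAUNCH_BLOCKING=1 makes kernel launches synchronous without touching the code, which may be a quicker first check.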