cudaGetLastError returns a strange error

thuszar1234 · May 23, 2018, 6:03am

Hello,

In my code, I have a sequence like this:

// do some stuff, launch kernels, etc

res = cudaDeviceSynchronize();
// check res

res = cudaGetLastError();
// check res

All calculations are done on the default stream and one thread.

The cudaDeviceSynchronize returns cudaSuccess, but the cudaGetLastError call returns an invalid device function error.

Should this be possible according to the CUDA API specification?

I mean, the sync call should wait until the device is finished, so no errors should be emitted between those two lines of code (once again, assuming a single-threaded app).

How can this happen?

Robert_Crovella · May 23, 2018, 9:48am

Yes, it's possible.

cudaDeviceSynchronize() returns the error code/result from the actual synchronization process, as well as any previous asynchronous errors. The invalid device function error is not an asynchronous error. It is an error that is discoverable/reportable at the moment the kernel launch is issued, not an error that results from kernel execution. It is also a non-sticky error, i.e. an error that does not “corrupt” the CUDA context, therefore it is not reported via ordinary API activity, but is reported via cudaGetLastError.

If you used proper CUDA error checking, you would discover this error before getting to the cudaDeviceSynchronize function.
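For illustration, here is a minimal sketch of what "proper CUDA error checking" usually looks like in practice. The CUDA_CHECK macro, the scale kernel and the run_scale function are hypothetical names made up for this example; they are not from the original poster's code. The key point is that cudaGetLastError() is checked immediately after the kernel launch, which is where a launch-time error such as invalid device function would be expected to show up, and cudaDeviceSynchronize() is checked afterwards for errors that occur during execution.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro (an assumption for this sketch, not part of the
// CUDA toolkit): print the error and exit if a runtime API call fails.
#define CUDA_CHECK(call)                                               \
    do {                                                               \
        cudaError_t err_ = (call);                                     \
        if (err_ != cudaSuccess) {                                     \
            std::fprintf(stderr, "CUDA error %s at %s:%d: %s\n",       \
                         cudaGetErrorName(err_), __FILE__, __LINE__,   \
                         cudaGetErrorString(err_));                    \
            std::exit(EXIT_FAILURE);                                   \
        }                                                              \
    } while (0)

// Hypothetical kernel: multiply each element by a factor.
__global__ void scale(float *data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Returns 0 if every step succeeded; CUDA_CHECK exits the process otherwise.
int run_scale(int n) {
    float *d_data = nullptr;
    CUDA_CHECK(cudaMalloc(&d_data, n * sizeof(float)));
    CUDA_CHECK(cudaMemset(d_data, 0, n * sizeof(float)));

    scale<<<(n + 255) / 256, 256>>>(d_data, 2.0f, n);
    CUDA_CHECK(cudaGetLastError());       // errors detected when the launch is issued
    CUDA_CHECK(cudaDeviceSynchronize());  // errors that occur while the kernel runs

    CUDA_CHECK(cudaFree(d_data));
    return 0;
}

int main() {
    if (run_scale(1 << 20) == 0) std::puts("all CUDA calls succeeded");
    return 0;
}
```

With a check like this after every launch, the failing kernel is identified at the launch site instead of surfacing later from an unrelated cudaGetLastError() call.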

thuszar1234 · May 23, 2018, 10:03am

Thank you!

“It is also a non-sticky error”
“an asynchronous error”

Is there a document that goes deeper into CUDA error handling? I don’t remember these terms from the CUDA programming guide and also the API documentation does not seem to mention them.

“cudaDeviceSynchronize() returns an error if one of the preceding tasks has failed.” (CUDA Runtime API :: CUDA Toolkit Documentation)

From the above, I’ve got the impression that cudaDeviceSynchronize() should ‘catch’ all previous errors.

Robert_Crovella · May 23, 2018, 10:36am

I can’t point you to a single concise reference that covers these topics. There are various questions/answers on Stack Overflow which cover responses to these types of questions (and probably here on devtalk.nvidia.com as well). Here is one such example on SO:

States of memory data after cuda exceptions - Stack Overflow

cudaGetLastError() or cudaPeekAtLastError(), which is/are referred to in many treatments of “proper CUDA error checking”, should catch any previous error, whether synchronous or asynchronous, sticky or non-sticky. The same cannot be said for cudaDeviceSynchronize(). This is easy to prove with a simple test case.
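A test case along those lines might look like the sketch below. Note that it does not reproduce the invalid device function error from the original question, since that depends on which architectures the binary was compiled for. Instead it provokes a different launch-time error, invalid configuration argument, by requesting 2048 threads per block, which is assumed to exceed the per-block limit of the GPU it runs on (1024 on current hardware). The noop kernel, the Results struct and the run_test function are hypothetical names for this example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical empty kernel; it never actually runs in this test.
__global__ void noop() {}

// Error codes collected by the test, in the order they were obtained.
struct Results {
    cudaError_t sync;    // from cudaDeviceSynchronize()
    cudaError_t first;   // from the first cudaGetLastError()
    cudaError_t second;  // from the second cudaGetLastError()
};

Results run_test() {
    Results r;
    // Assumption: 2048 threads per block is above the device limit,
    // so the launch is rejected when it is issued.
    noop<<<1, 2048>>>();
    r.sync   = cudaDeviceSynchronize();  // nothing was launched, so nothing failed on the device
    r.first  = cudaGetLastError();       // returns the launch error and resets it
    r.second = cudaGetLastError();       // non-sticky error: already cleared by the previous call
    return r;
}

int main() {
    Results r = run_test();
    std::printf("cudaDeviceSynchronize:    %s\n", cudaGetErrorName(r.sync));
    std::printf("cudaGetLastError (first): %s\n", cudaGetErrorName(r.first));
    std::printf("cudaGetLastError (again): %s\n", cudaGetErrorName(r.second));
    return 0;
}
```

If the behaviour matches what is described in this thread, the synchronize call reports success, the first cudaGetLastError() reports the launch error, and the second reports success again because a non-sticky error is cleared once it has been read.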
