cudaMemcpy returning cudaSuccess even when the copy has not been performed

According to the manuals, it should not be possible to access device memory that was allocated from another thread.
Indeed this is the case, but although such copies are not possible and are not performed, cudaMemcpy returns cudaSuccess rather than cudaErrorInvalidDevicePointer as it should.
I've seen that other cudart API calls can detect that the memory was allocated by another thread and return cudaErrorInvalidDevicePointer (like cudaMemset, as discussed in another topic). Is the failure to detect this error a bug that may be resolved in a future release?

The attached example shows that cudaSuccess is returned even though the copy has not been properly performed.

cudaMemcpyTestCase.cpp (3.55 KB)


Can someone verify this?

This is weird.

Well, first of all, you should change

CUDA_SUCCESS( cudaMemcpy(myDeviceInMemory, myHostInMemory, theSize, cudaMemcpyHostToDevice) );

to

CUDA_SUCCESS( cudaMemcpy(myDeviceInMemory, myHostInMemory, theSize*sizeof(float), cudaMemcpyHostToDevice) );

since cudaMemcpy expects the size in bytes rather than in elements.
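To illustrate why the byte count matters, here is a minimal host-only sketch. It uses std::memcpy as a stand-in for cudaMemcpy (both take a size in bytes), and the helper name copiedElements is hypothetical, not something from the attached test case.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: copies byteCount bytes from src into a zero-filled
// destination of the same length, then returns how many leading elements
// of the destination match the source.
std::size_t copiedElements(const std::vector<float>& src, std::size_t byteCount) {
    // Guard against copying past the end of either buffer.
    assert(byteCount <= src.size() * sizeof(float));

    std::vector<float> dst(src.size(), 0.0f);
    std::memcpy(dst.data(), src.data(), byteCount);  // size is in bytes

    std::size_t matching = 0;
    while (matching < src.size() && dst[matching] == src[matching]) {
        ++matching;
    }
    return matching;
}
```

With a source of non-zero floats, passing the element count as the size copies only the first theSize / sizeof(float) elements, while theSize * sizeof(float) copies all of them.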

When running your code as is, I get:

Performing the copy on the main thread.
Performing the copy on different thread.
Error! expected: 0 measured 2.35099e-38

When inserting a simple printf statement in the macro:

#define CUDA_SUCCESS( condition ) { \
    /* debug print; note that this evaluates condition a first time */ \
    fprintf( stderr, "c = %d\n", condition ); \
    cudaError_t myError = (condition); \
    if( myError != cudaSuccess ) { \
        printf( "\n FAILURE in %s, line %d\n", __FILE__, __LINE__ ); \
        printf( "Error: %s\n", cudaGetErrorString(myError) ); \
    } }

I get:

c = 0
c = 0
Performing the copy on the main thread.
c = 0
c = 0
c = 0
c = 0

FAILURE in src/cudaMemcpyTestCase.cpp, line 46
Error: invalid device pointer
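
One thing to keep in mind about the instrumented macro: it expands condition twice, so every wrapped call (including each cudaMemcpy) is issued twice. Whether that plays a role in the output above is not established here. The sketch below only demonstrates the double evaluation, using a hypothetical stand-in function fakeApiCall instead of a real CUDA runtime call, and hypothetical macro names CHECK_TWICE and CHECK_ONCE.

```cpp
#include <cstdio>

// Hypothetical stand-in for a CUDA runtime call: counts how often it is
// invoked and always reports success (0).
static int gCallCount = 0;
int fakeApiCall() { ++gCallCount; return 0; }

// Same shape as the instrumented macro: condition appears twice in the
// expansion, so the wrapped call is executed twice.
#define CHECK_TWICE( condition ) { \
    std::fprintf( stderr, "c = %d\n", (condition) ); \
    int myError = (condition); \
    if( myError != 0 ) { std::printf( "FAILURE\n" ); } }

// Variant that evaluates condition once and prints the stored result.
#define CHECK_ONCE( condition ) { \
    int myError = (condition); \
    std::fprintf( stderr, "c = %d\n", myError ); \
    if( myError != 0 ) { std::printf( "FAILURE\n" ); } }

// Returns how many times fakeApiCall ran for a single macro use.
int callsMadeBy(bool once) {
    gCallCount = 0;
    if (once) { CHECK_ONCE( fakeApiCall() ); }
    else      { CHECK_TWICE( fakeApiCall() ); }
    return gCallCount;
}
```

Storing the result first and printing the stored value keeps the debug output without issuing the call a second time.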