According to the manuals, it should not be possible to access device memory that was allocated by another thread.
Indeed this is the case, but although the copy is not possible and is not performed, cudaMemcpy returns cudaSuccess rather than cudaErrorInvalidDevicePointer, as it should.
I’ve seen that other cudart API calls can detect that the memory was allocated by another thread and return cudaErrorInvalidDevicePointer (cudaMemset, for example; see the topic http://forums.nvidia.com/index.php?showtopic=98820). Is the failure to detect this error in cudaMemcpy a bug that may be fixed in a future release?
The attached example shows that cudaSuccess is returned even though the copy has not been performed properly.
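For readers without the attachment, a test of this kind could be sketched roughly as follows. This is not the attached file, only a minimal illustration: the names `copy_job`, `try_copy` and `N` are made up for the example, and it assumes a toolkit where each host thread gets its own context. On CUDA 4.0 and later, threads in the same process share the device context, so there the copy should simply succeed and the data should match.

```cuda
// Build (assumed command line): nvcc repro.cu -o repro -lpthread
#include <cstdio>
#include <cstring>
#include <pthread.h>
#include <cuda_runtime.h>

#define N 256  // hypothetical buffer size, chosen only for the example

// Arguments and result for the second host thread (hypothetical helper type).
struct copy_job {
    const int  *d_src;   // device pointer allocated by the main thread
    int        *h_dst;   // host buffer to copy into
    cudaError_t status;  // return code of cudaMemcpy in the second thread
};

// Runs in the second thread: tries to read back memory it did not allocate.
static void *try_copy(void *arg)
{
    copy_job *job = (copy_job *)arg;
    job->status = cudaMemcpy(job->h_dst, job->d_src, N * sizeof(int),
                             cudaMemcpyDeviceToHost);
    return NULL;
}

int main()
{
    int h_src[N], h_dst[N];
    for (int i = 0; i < N; ++i) { h_src[i] = i; h_dst[i] = -1; }

    // Allocate and fill the device buffer from the main thread.
    int *d_buf = NULL;
    cudaError_t err = cudaMalloc((void **)&d_buf, N * sizeof(int));
    if (err == cudaSuccess)
        err = cudaMemcpy(d_buf, h_src, N * sizeof(int), cudaMemcpyHostToDevice);
    if (err != cudaSuccess) {
        printf("setup failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Attempt the device-to-host copy from a different host thread.
    copy_job job = { d_buf, h_dst, cudaSuccess };
    pthread_t worker;
    if (pthread_create(&worker, NULL, try_copy, &job) != 0) {
        printf("pthread_create failed\n");
        return 1;
    }
    pthread_join(worker, NULL);

    // Compare the reported status with what actually arrived in h_dst.
    int data_ok = (memcmp(h_src, h_dst, sizeof(h_src)) == 0);
    printf("cudaMemcpy in second thread: %s\n", cudaGetErrorString(job.status));
    printf("data copied correctly: %s\n", data_ok ? "yes" : "no");

    cudaFree(d_buf);
    return 0;
}
```

The behaviour described in this post corresponds to the first line printing a success status while the second line prints "no". Checking the destination buffer separately from the return code is the point of the test, since the return code alone cannot be trusted here.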