Inter-device copying

Hi there,

I want to copy data from device memory on GPU A to device memory on GPU B. Is it possible to do this without having to use host memory as a temporary buffer?

I think cudaMemcpyDeviceToDevice is only meant for intra-device copying, but I'm not sure. I suppose CUDA could be smart enough to assign a distinct address range to each available device and thereby detect that the address in the "from" parameter does not belong to the same device as the address in the "to" parameter. Then again, the address ranges might not fit into the range descriptor, or CUDA may not be -that- nifty ;).
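
For concreteness, the kind of call I'm wondering about looks roughly like this (the pointer and size names are just placeholders; the buffers are assumed to have been allocated with cudaMalloc on GPU A and GPU B respectively):

[codebox]
#include <cuda_runtime.h>

// devPtrA points to memory on GPU A, devPtrB to memory on GPU B.
void attemptCrossDeviceCopy(void* devPtrB, const void* devPtrA, size_t nbytes)
{
    // Does the runtime notice that the two pointers live on different
    // devices, or is this call only valid within a single device?
    cudaMemcpy(devPtrB, devPtrA, nbytes, cudaMemcpyDeviceToDevice);
}
[/codebox]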

Can someone enlighten me? :)

Kind Regards,

Frans.

No choice but to copy via the host, I am afraid. Be aware that each GPU context must be held by a separate host thread as well.
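
For what it's worth, a minimal sketch of the host-staged approach might look like the one below. It assumes a runtime where a single host thread can switch devices with cudaSetDevice; with the one-context-per-thread requirement mentioned above, you would split the two halves across two host threads instead. The function and variable names (copyBetweenDevices, devA, devB, staging) are just illustrative.

[codebox]
#include <cuda_runtime.h>
#include <stdlib.h>

// Copy nbytes from srcA (device memory on devA) to dstB (device memory
// on devB), staging through a plain host buffer.
cudaError_t copyBetweenDevices(int devA, const void* srcA,
                               int devB, void* dstB,
                               size_t nbytes)
{
    void* staging = malloc(nbytes);          // host staging buffer
    if (!staging) return cudaErrorMemoryAllocation;

    cudaSetDevice(devA);                     // select source GPU
    cudaError_t err = cudaMemcpy(staging, srcA, nbytes,
                                 cudaMemcpyDeviceToHost);
    if (err == cudaSuccess) {
        cudaSetDevice(devB);                 // select destination GPU
        err = cudaMemcpy(dstB, staging, nbytes,
                         cudaMemcpyHostToDevice);
    }
    free(staging);
    return err;
}
[/codebox]

Using pinned host memory (cudaMallocHost) for the staging buffer instead of malloc would speed up both transfers, at the cost of locking that memory down.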