Copying from GPU0 to GPU1: is there a way to do it without going through the host?

Hi. I’ve read in older posts that there is no way to copy data directly from one GPU to another.
cudaMemcpy(…, cudaMemcpyDeviceToDevice) is said to be only for copies within a single device.

Has there been any development in this area, say in CUDA 3.0?

Thank you.

I don’t believe anything has changed in that area.