Copy data from one device to another with multiple GPUs

Hi! I’m trying to run some simple tests that allocate and copy data.
How can I copy data across two different devices without using the host memory? Is there a way to do it or will the host be necessary for this operation?

I have written a dual-Tesla application. You have to copy the data from the source device to host memory, and then from host memory to the destination device.
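
For reference, here is a minimal sketch of that two-step copy through host memory. It is not taken from my application; the names `copy_via_host` and `roundtrip_ok` and the `CUDA_CHECK` macro are made up for illustration. It assumes a CUDA runtime in which a single host thread can switch between GPUs with `cudaSetDevice` (CUDA 4.0 or later); with older runtimes each device would need its own host thread.

```cuda
#include <cuda_runtime.h>
#include <cstdio>
#include <cstdlib>
#include <vector>

// Abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                                  \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            std::fprintf(stderr, "CUDA error: %s at %s:%d\n",             \
                         cudaGetErrorString(err_), __FILE__, __LINE__);   \
            std::exit(EXIT_FAILURE);                                      \
        }                                                                 \
    } while (0)

// Hypothetical helper: copy `bytes` bytes from `src_ptr` (allocated on
// device `src_dev`) to `dst_ptr` (allocated on device `dst_dev`),
// staging the data in a temporary host buffer.
void copy_via_host(void* dst_ptr, int dst_dev,
                   const void* src_ptr, int src_dev, size_t bytes) {
    std::vector<unsigned char> staging(bytes);

    // Step 1: source device -> host memory.
    CUDA_CHECK(cudaSetDevice(src_dev));
    CUDA_CHECK(cudaMemcpy(staging.data(), src_ptr, bytes,
                          cudaMemcpyDeviceToHost));

    // Step 2: host memory -> destination device.
    CUDA_CHECK(cudaSetDevice(dst_dev));
    CUDA_CHECK(cudaMemcpy(dst_ptr, staging.data(), bytes,
                          cudaMemcpyHostToDevice));
}

// Hypothetical self-check: fill a buffer on `src_dev`, move it to
// `dst_dev` through the host, read it back, and compare with the input.
bool roundtrip_ok(int src_dev, int dst_dev, size_t n) {
    const size_t bytes = n * sizeof(float);
    std::vector<float> input(n), output(n, 0.0f);
    for (size_t i = 0; i < n; ++i) input[i] = static_cast<float>(i);

    float* d_src = nullptr;
    float* d_dst = nullptr;

    // Allocate and fill the source buffer on the source device.
    CUDA_CHECK(cudaSetDevice(src_dev));
    CUDA_CHECK(cudaMalloc(&d_src, bytes));
    CUDA_CHECK(cudaMemcpy(d_src, input.data(), bytes,
                          cudaMemcpyHostToDevice));

    // Allocate the destination buffer on the destination device.
    CUDA_CHECK(cudaSetDevice(dst_dev));
    CUDA_CHECK(cudaMalloc(&d_dst, bytes));

    copy_via_host(d_dst, dst_dev, d_src, src_dev, bytes);

    // copy_via_host leaves dst_dev as the current device.
    CUDA_CHECK(cudaMemcpy(output.data(), d_dst, bytes,
                          cudaMemcpyDeviceToHost));

    // Free each buffer with its owning device selected.
    CUDA_CHECK(cudaSetDevice(src_dev));
    CUDA_CHECK(cudaFree(d_src));
    CUDA_CHECK(cudaSetDevice(dst_dev));
    CUDA_CHECK(cudaFree(d_dst));

    return input == output;
}
```

Calling `roundtrip_ok(0, 1, 1 << 20)` from `main` on a two-GPU machine would exercise the copy from device 0 to device 1. The staging buffer here is ordinary pageable memory; a page-locked buffer from `cudaMallocHost` would probably make both transfers faster, but I have left that out to keep the sketch short.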