Synchronizing on peer memory copies


Looking at the peer memory copy API, it is not clear to me how to properly synchronize on both devices. In particular, when I issue a copy from one device to another, I would like to track completion of the copy on the destination device (so that I can start reading). Tracking completion on the source device is also important, so that I don’t prematurely overwrite the source buffer. cuMemcpyPeerAsync only takes one stream as a parameter, which lets me track completion on either the source or the destination, but not both.


Use a stream to manage the activity on the device where the stream is local.
Use an event on the other device.

Hi Robert, thanks for your reply.
According to the CUDA documentation for cuEventRecord, hEvent and hStream must be from the same context. How do I queue the triggering of an event on another device upon copy completion?

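Record the event in the stream used for the copy (same context, so that restriction is satisfied), then launch a cudaStreamWaitEvent on the event in a stream on the other device. The cudaStreamWaitEvent can be launched into any stream.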

Yes, thanks. Another feature, the most important one for me, which I have just discovered (thanks to you), is that cudaStreamWaitEvent can wait on events from any context. Now things suddenly make sense and I can finally connect the dots. Thanks for the tip!
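
For anyone landing here later, this is roughly how I understand the pattern. It is only a minimal sketch, written against the runtime API (cudaMemcpyPeerAsync, cudaEventRecord, cudaStreamWaitEvent) rather than the driver API from my original question. The helper name peerCopyRoundTrip, the CHECK macro, the buffer and stream names, and the 0x2A fill value are all my own placeholders, not anything from the CUDA documentation. The copy is issued into a stream on the source device, so later work in that stream is ordered after it. An event recorded in that same stream is what a stream on the destination device waits on.

```cuda
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

// Hypothetical error-check macro: print the failing call and bail out.
#define CHECK(call)                                                    \
    do {                                                               \
        cudaError_t err_ = (call);                                     \
        if (err_ != cudaSuccess) {                                     \
            std::fprintf(stderr, "%s failed: %s\n", #call,             \
                         cudaGetErrorString(err_));                    \
            return false;                                              \
        }                                                              \
    } while (0)

// Hypothetical helper: copies `bytes` from srcDev to dstDev and checks
// that the destination sees the data. Resources are not released on the
// early-return error paths, to keep the sketch short.
bool peerCopyRoundTrip(int srcDev, int dstDev, size_t bytes) {
    unsigned char *srcBuf = nullptr, *dstBuf = nullptr;
    cudaStream_t srcStream, dstStream;
    cudaEvent_t copyDone;

    // Destination-side resources.
    CHECK(cudaSetDevice(dstDev));
    CHECK(cudaMalloc(&dstBuf, bytes));
    CHECK(cudaStreamCreate(&dstStream));

    // Source-side resources. The event lives on the source device so it
    // can be recorded in srcStream (event and stream must match there).
    CHECK(cudaSetDevice(srcDev));
    CHECK(cudaMalloc(&srcBuf, bytes));
    CHECK(cudaStreamCreate(&srcStream));
    CHECK(cudaEventCreateWithFlags(&copyDone, cudaEventDisableTiming));

    // Fill the source, issue the peer copy, then mark its completion.
    CHECK(cudaMemsetAsync(srcBuf, 0x2A, bytes, srcStream));
    CHECK(cudaMemcpyPeerAsync(dstBuf, dstDev, srcBuf, srcDev, bytes, srcStream));
    CHECK(cudaEventRecord(copyDone, srcStream));

    // Source side: stream order alone guarantees this overwrite runs
    // only after the copy has finished reading srcBuf.
    CHECK(cudaMemsetAsync(srcBuf, 0, bytes, srcStream));

    // Destination side: make dstStream wait on the event from the other
    // device before anything in it reads dstBuf.
    CHECK(cudaSetDevice(dstDev));
    CHECK(cudaStreamWaitEvent(dstStream, copyDone, 0));
    std::vector<unsigned char> host(bytes);
    CHECK(cudaMemcpyAsync(host.data(), dstBuf, bytes,
                          cudaMemcpyDeviceToHost, dstStream));
    CHECK(cudaStreamSynchronize(dstStream));

    CHECK(cudaSetDevice(srcDev));
    CHECK(cudaStreamSynchronize(srcStream));

    // The destination must hold the fill value, not the later overwrite.
    bool ok = true;
    for (unsigned char b : host) ok = ok && (b == 0x2A);

    // Release resources on their owning devices.
    CHECK(cudaEventDestroy(copyDone));
    CHECK(cudaStreamDestroy(srcStream));
    CHECK(cudaFree(srcBuf));
    CHECK(cudaSetDevice(dstDev));
    CHECK(cudaStreamDestroy(dstStream));
    CHECK(cudaFree(dstBuf));
    return ok;
}
```

On a machine with at least two GPUs, calling peerCopyRoundTrip(0, 1, 1 << 20) from main should exercise both halves: the overwrite of the source buffer is held back by stream order, and the read on the destination is held back by the event wait. I put the copy in the source-side stream here, but I believe the mirror image (copy in a destination-side stream, with the source stream waiting on the event) works the same way.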