Process A: cuMemcpy2D(srcA, dstA) (device-to-device copy) → cuIpcGetMemHandle(dstA), i.e. the destination of the previous cuMemcpy2D → send the handle to Process B over a pipe
Process B: cuIpcOpenMemHandle (handle read from the pipe) → cuMemcpy2D(dstA, dstB) (device-to-device copy)
Here dstA is a device buffer used only as temporary staging memory for the IPC transfer.
Assuming both processes run on the same GPU and use the default (NULL) stream, is cuMemcpy2D(dstA, dstB) in Process B guaranteed to execute after cuMemcpy2D(srcA, dstA) in Process A has completed, even though the two calls are made from different processes in different contexts?
In other words, is dstB guaranteed to end up with the contents of srcA?