I have a question about copying data between multiple GPUs.
I have implemented a program with CUDA.
Now I would like to extend it to use multiple GPUs to improve performance.
I haven't started implementing the multi-GPU version yet, so I don't know much about it.
I am wondering how to copy data directly between devices.
I am going to use two GPU devices.
The program I am trying to implement has to share data between the devices during processing.
I want to know how to copy data directly from one device to the other in the middle of processing.
Here ‘processing’ means the process on the host, not processing inside a kernel.
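To make the question concrete, here is a rough sketch of the kind of host-side copy I have in mind. It assumes that the CUDA runtime calls cudaDeviceCanAccessPeer, cudaDeviceEnablePeerAccess and cudaMemcpyPeer are the right tools for this, and that the two GPUs are devices 0 and 1. The buffer names, the buffer size and the CHECK macro are made up for illustration, and I have not run this on a two-GPU machine.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

int main() {
    int deviceCount = 0;
    CHECK(cudaGetDeviceCount(&deviceCount));
    if (deviceCount < 2) {
        printf("This sketch needs two GPUs, found %d\n", deviceCount);
        return 0;
    }

    const size_t n = 1 << 20;                 // illustrative element count
    const size_t bytes = n * sizeof(float);
    std::vector<float> hostIn(n), hostOut(n, 0.0f);
    for (size_t i = 0; i < n; ++i) hostIn[i] = static_cast<float>(i);

    // Allocate and fill a buffer on device 0.
    float *buf0 = nullptr, *buf1 = nullptr;
    CHECK(cudaSetDevice(0));
    CHECK(cudaMalloc(&buf0, bytes));
    CHECK(cudaMemcpy(buf0, hostIn.data(), bytes, cudaMemcpyHostToDevice));

    // Allocate the destination buffer on device 1.
    CHECK(cudaSetDevice(1));
    CHECK(cudaMalloc(&buf1, bytes));

    // Enable peer access in both directions if the hardware supports it.
    // Assumption: without peer access cudaMemcpyPeer still works, but the
    // data may be staged through host memory instead of going directly.
    int can01 = 0, can10 = 0;
    CHECK(cudaDeviceCanAccessPeer(&can01, 0, 1));
    CHECK(cudaDeviceCanAccessPeer(&can10, 1, 0));
    if (can01 && can10) {
        CHECK(cudaSetDevice(0));
        CHECK(cudaDeviceEnablePeerAccess(1, 0));
        CHECK(cudaSetDevice(1));
        CHECK(cudaDeviceEnablePeerAccess(0, 0));
    }

    // The copy itself, issued from host code: device 0 -> device 1.
    CHECK(cudaMemcpyPeer(buf1, 1, buf0, 0, bytes));

    // Read the result back from device 1 and compare with the input.
    CHECK(cudaSetDevice(1));
    CHECK(cudaMemcpy(hostOut.data(), buf1, bytes, cudaMemcpyDeviceToHost));
    bool ok = true;
    for (size_t i = 0; i < n; ++i) {
        if (hostOut[i] != hostIn[i]) { ok = false; break; }
    }
    printf("peer copy %s\n", ok ? "matched" : "MISMATCH");

    CHECK(cudaFree(buf1));
    CHECK(cudaSetDevice(0));
    CHECK(cudaFree(buf0));
    return ok ? 0 : 1;
}
```

In the real program the copy would sit between kernel launches on the two devices rather than right after the upload, but the call sequence should be the same if my assumptions hold.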
Is that possible?
If so, is there any example source code?
Please let me know how to do it.
Thanks in advance.