Hi,
I have accelerated my Fortran code using OpenACC. On a single K20 with 4GB of memory, I can simulate about 1 million particles.
My problem now is how to simulate more particles using two K20s.
My code has one big matrix. It is calculated on the host before the OpenACC parallel part starts, and it stays constant until the end of the simulation.
I copy the matrix to the device with copyin and use a data region to keep it there.
To use two GPUs, I plan to transfer the first half of the matrix to the first GPU and the second half to the second GPU. No data exchange is needed between the two GPUs. I have read several tutorials on multiple GPUs, but I didn't find any example showing how to use a data region with multiple GPUs. Does anyone know how to do this?
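To make the question concrete, here is a rough, untested sketch of what I have in mind. It is not my real code: the program name, the sizes, the column-sum loop and the variables a, res, lo and hi are placeholders I made up. I am assuming a compiler that supports the OpenACC 2.5 `set device_num` directive (the older equivalent would be calling `acc_set_device_num` from the `openacc` module), that device numbers start at 0, and that each device keeps its own copy of whatever was sent to it with `enter data`.

```fortran
program split_matrix_two_gpus
  implicit none
  integer, parameter :: ngpus = 2               ! assumption: two K20s
  integer, parameter :: nrow = 4, ncol = 1000   ! placeholder sizes
  real(8), allocatable :: a(:,:), res(:)
  real(8) :: s
  integer :: dev, lo, hi, i, j

  allocate(a(nrow, ncol), res(ncol))
  a = 1.0d0          ! stand-in for the matrix computed on the host

  ! Send one block of columns to each GPU. Splitting on the last
  ! index keeps each block contiguous in Fortran's column-major layout.
  do dev = 0, ngpus - 1
    lo = (dev * ncol) / ngpus + 1
    hi = ((dev + 1) * ncol) / ngpus
    !$acc set device_num(dev)
    !$acc enter data copyin(a(:, lo:hi))
  end do

  ! Each GPU works only on the columns it holds.
  do dev = 0, ngpus - 1
    lo = (dev * ncol) / ngpus + 1
    hi = ((dev + 1) * ncol) / ngpus
    !$acc set device_num(dev)
    !$acc parallel loop present(a(:, lo:hi)) &
    !$acc   copyout(res(lo:hi)) private(s)
    do j = lo, hi
      s = 0.0d0
      !$acc loop seq
      do i = 1, nrow
        s = s + a(i, j)
      end do
      res(j) = s
    end do
  end do

  ! Release the device copies at the end of the simulation.
  do dev = 0, ngpus - 1
    lo = (dev * ncol) / ngpus + 1
    hi = ((dev + 1) * ncol) / ngpus
    !$acc set device_num(dev)
    !$acc exit data delete(a(:, lo:hi))
  end do

  print *, 'sum of column sums:', sum(res)
end program split_matrix_two_gpus
```

As written, I think the two GPUs would run one after the other, because each parallel loop is synchronous. My guess is that I would need an async clause plus a wait on each device, or one OpenMP thread or MPI rank per GPU, to keep both busy. Is this the right pattern?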
Thanks,
GZ