I am working on a code for integrating an N-body system that uses OpenMP and CUDA. I have a machine with 2 Tesla C1060 cards and 8 cores, but I don't know how to handle two GPUs with OpenMP. I need to copy an array to each GPU and then have the first GPU calculate the acceleration of particles 1 to N/2 and the second GPU the acceleration of particles N/2 to N. Please help, I urgently need to solve this problem.
Have you seen the cudaOpenMP example in the SDK? It’s pretty straightforward.
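In case it helps, here is a rough sketch of the pattern as I understand it: one OpenMP host thread per GPU, each thread calling cudaSetDevice, copying the full position array to its own card and computing only its slice of the accelerations. The names slice_begin, accel_kernel and compute_accelerations, the float4 (x, y, z, mass) layout, G = 1 and the softening constant are all my own assumptions for illustration, not code from the SDK sample.

```cuda
// Sketch only: one OpenMP thread drives each GPU.
// Build line is an assumption (GCC host compiler): nvcc -Xcompiler -fopenmp nbody_multi_gpu.cu
#include <cuda_runtime.h>
#include <omp.h>
#include <stdio.h>

#define SOFTENING2 1e-9f  // hypothetical softening term, avoids division by zero

// First particle index owned by part `p` when n particles are split into `parts` slices.
// Part p owns the half-open range [slice_begin(n, parts, p), slice_begin(n, parts, p + 1)).
static inline int slice_begin(int n, int parts, int p) {
    return (int)((long long)n * p / parts);
}

// Direct-sum acceleration for particles in [first, last).
// pos[j] = (x, y, z, mass); acc is indexed relative to `first`; G = 1 is assumed.
__global__ void accel_kernel(const float4 *pos, float3 *acc, int n, int first, int last) {
    int i = first + blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= last) return;
    float4 pi = pos[i];
    float3 a = make_float3(0.0f, 0.0f, 0.0f);
    for (int j = 0; j < n; ++j) {
        float4 pj = pos[j];
        float dx = pj.x - pi.x, dy = pj.y - pi.y, dz = pj.z - pi.z;
        float inv_r = rsqrtf(dx * dx + dy * dy + dz * dz + SOFTENING2);
        float s = pj.w * inv_r * inv_r * inv_r;  // the j == i term is zero since dx = dy = dz = 0
        a.x += s * dx; a.y += s * dy; a.z += s * dz;
    }
    acc[i - first] = a;
}

// Fills h_acc[0..n) using every visible GPU. Returns 0 on success, -1 on any CUDA error.
int compute_accelerations(const float4 *h_pos, float3 *h_acc, int n) {
    int ndev = 0;
    if (cudaGetDeviceCount(&ndev) != cudaSuccess || ndev < 1) return -1;
    if (n <= 0) return 0;
    int failed = 0;

    #pragma omp parallel num_threads(ndev) reduction(|:failed)
    {
        int dev = omp_get_thread_num();
        int parts = omp_get_num_threads();  // split by the threads actually granted
        int first = slice_begin(n, parts, dev);
        int last = slice_begin(n, parts, dev + 1);
        int count = last - first;
        float4 *d_pos = NULL;
        float3 *d_acc = NULL;

        cudaError_t err = cudaSetDevice(dev);  // bind this host thread to its own GPU
        if (err == cudaSuccess) err = cudaMalloc((void **)&d_pos, n * sizeof(float4));
        if (err == cudaSuccess && count > 0) err = cudaMalloc((void **)&d_acc, count * sizeof(float3));
        // Every GPU needs all positions, because each particle interacts with all others.
        if (err == cudaSuccess) err = cudaMemcpy(d_pos, h_pos, n * sizeof(float4), cudaMemcpyHostToDevice);
        if (err == cudaSuccess && count > 0) {
            int threads = 256;
            int blocks = (count + threads - 1) / threads;
            accel_kernel<<<blocks, threads>>>(d_pos, d_acc, n, first, last);
            err = cudaGetLastError();
            // Blocking copy: waits for the kernel, then writes only this GPU's slice.
            if (err == cudaSuccess)
                err = cudaMemcpy(h_acc + first, d_acc, count * sizeof(float3), cudaMemcpyDeviceToHost);
        }
        cudaFree(d_pos);
        cudaFree(d_acc);
        if (err != cudaSuccess) {
            fprintf(stderr, "GPU %d: %s\n", dev, cudaGetErrorString(err));
            failed = 1;
        }
    }
    return failed ? -1 : 0;
}
```

You would call compute_accelerations(h_pos, h_acc, N) from your integrator each step. With two C1060s that gives two host threads, so GPU 0 handles indices 0 to N/2 - 1 and GPU 1 handles N/2 to N - 1 (zero-based). For a real integrator you would probably want to keep the device buffers allocated between steps instead of reallocating them on every call.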