MPI in CUDA


I am working on a project that implements Beal's conjecture in CUDA on a cluster with one frontend and two compute nodes. The frontend is used only for simulation; the two compute nodes each contain GPUs and are used to run the CUDA program. Since we are using a cluster, we also have to write an MPI program.

Can anyone suggest how I should start? That is, should I write a single program that uses MPI inside the CUDA program itself, or should I write separate programs for MPI and CUDA? Can anyone also point me to a beginner's tutorial on using MPI with CUDA?

We use CUDA on Red Hat Linux.

Waiting for your reply.

I’m not sure I understand the question.

Since you can't call MPI from within a CUDA kernel (and I don't see any reason why one would want to either), you write a single MPI program that distributes the work across the cluster nodes, and each MPI process launches CUDA kernels on its local GPU. I don't think you need any particular documentation on how to combine the two.
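To make that structure concrete, here is a minimal sketch, not a definitive implementation. It assumes one MPI process per compute node, a recent enough MPI to provide MPI_UNSIGNED_LONG_LONG, and a GPU that supports 64-bit atomicAdd. The kernel name check_candidates, the search size, and the "a % 7 == 0" test are all made up for illustration; the real Beal check would replace that test. Error checking is mostly left out to keep it short.

```cuda
// Sketch only: MPI splits a range of candidates across processes,
// and each process runs a CUDA kernel on its own slice.
#include <mpi.h>
#include <cuda_runtime.h>
#include <stdio.h>

// Hypothetical placeholder kernel: one thread per candidate in [begin, end).
// The modulo test stands in for the real per-candidate check.
__global__ void check_candidates(unsigned long long begin,
                                 unsigned long long end,
                                 unsigned long long *hits)
{
    unsigned long long a =
        begin + blockIdx.x * (unsigned long long)blockDim.x + threadIdx.x;
    if (a < end && a % 7 == 0) {
        atomicAdd(hits, 1ULL);
    }
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank = 0, size = 1;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // Pick a GPU on this node (assumption: ranks on a node map round-robin).
    int ndev = 0;
    cudaGetDeviceCount(&ndev);
    if (ndev == 0) {
        fprintf(stderr, "rank %d: no CUDA device found\n", rank);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }
    cudaSetDevice(rank % ndev);

    // Split [0, total) into one contiguous slice per rank.
    const unsigned long long total = 1ULL << 20;  // illustrative size
    unsigned long long chunk = (total + size - 1) / size;
    unsigned long long begin = (unsigned long long)rank * chunk;
    if (begin > total) begin = total;
    unsigned long long end = begin + chunk;
    if (end > total) end = total;

    // Device-side counter for this rank's slice.
    unsigned long long h_hits = 0;
    unsigned long long *d_hits = NULL;
    cudaMalloc(&d_hits, sizeof(h_hits));
    cudaMemset(d_hits, 0, sizeof(h_hits));

    unsigned long long n = end - begin;
    if (n > 0) {
        const int threads = 256;
        unsigned int blocks = (unsigned int)((n + threads - 1) / threads);
        check_candidates<<<blocks, threads>>>(begin, end, d_hits);
    }

    // This copy waits for the kernel to finish before reading the counter.
    cudaMemcpy(&h_hits, d_hits, sizeof(h_hits), cudaMemcpyDeviceToHost);
    cudaFree(d_hits);

    // Combine the per-rank counts on rank 0.
    unsigned long long total_hits = 0;
    MPI_Reduce(&h_hits, &total_hits, 1, MPI_UNSIGNED_LONG_LONG,
               MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0) {
        printf("total hits: %llu\n", total_hits);
    }

    MPI_Finalize();
    return 0;
}
```

You would compile this with nvcc and link it against your MPI library (the exact flags depend on your installation), then start one process per compute node with your MPI launcher, e.g. mpirun. All MPI calls stay in host code; the kernel itself knows nothing about MPI.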

A good introduction to CUDA programming is the CUDA Programming Guide that Nvidia ships with the CUDA toolkit. For MPI, I found the “MPI Primer / Developing With LAM” from the Ohio Supercomputer Center to be a good starting point (just throw that into Google, you'll probably find a copy on the web).

Hope that helps.