CUDA + MPI: are they compatible in PGI Fortran?

I am writing my master’s thesis and I am new to this hot mix =))
Does anyone know how to write such programs? =) And can I mix these two? ))

I need several CPUs (for example, 2) that receive tasks independently and send fine-grained parts of the work to their CUDA devices…

It’s only a dream, but who knows… =)

Thanks a lot for your interest, even if you don’t know what to tell me =)

CUDA Fortran and MPI are indeed compatible. In fact, if you want to use multiple GPUs, you have to use MPI, OpenMP, or some similar technology to control which CPUs talk to which GPUs.

In my work, I use MPI to control 2 CPUs and 2 GPUs, all the way up to 8 CPUs and 4 GPUs. (NB: The latter isn’t ideal, but it works!) And with MPI it’s fairly simple. A skeleton method is:

    STATUS = cudaGetDeviceCount(num_devices)
    devicenum = mod(rank, num_devices)
    STATUS = cudaSetDevice(devicenum)

Obviously, you should test STATUS appropriately, but this is the gist of it: find out how many devices there are, take the rank modulo that number, and set the device to the result. So rank 0 gets GPU 0, rank 1 gets GPU 1, and so on.
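For reference, here is a slightly fuller sketch of the same idea as a complete program. It assumes the PGI `cudafor` module and an MPI library that provides the `mpi` module (with older MPI installations, `include 'mpif.h'` would replace `use mpi`). The program and variable names are only illustrative, and the modulo mapping assumes that the ranks on a node are numbered consecutively.

```fortran
program mpi_gpu_select
   use cudafor   ! CUDA Fortran runtime API
   use mpi       ! assumption: the MPI library ships an 'mpi' module
   implicit none
   integer :: ierr, rank, num_devices, devicenum, istat

   call MPI_Init(ierr)
   call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)

   ! How many GPUs can this process see?
   istat = cudaGetDeviceCount(num_devices)
   if (istat /= cudaSuccess .or. num_devices < 1) then
      write (*,*) 'rank ', rank, ': no usable CUDA device: ', &
                  trim(cudaGetErrorString(istat))
      call MPI_Abort(MPI_COMM_WORLD, 1, ierr)
   end if

   ! Device numbers start at 0, so mod() always yields a valid one
   devicenum = mod(rank, num_devices)

   istat = cudaSetDevice(devicenum)
   if (istat /= cudaSuccess) then
      write (*,*) 'rank ', rank, ': cudaSetDevice failed: ', &
                  trim(cudaGetErrorString(istat))
      call MPI_Abort(MPI_COMM_WORLD, 1, ierr)
   end if

   write (*,*) 'rank ', rank, ' is using GPU ', devicenum

   ! ... allocate device arrays and launch kernels here ...

   call MPI_Finalize(ierr)
end program mpi_gpu_select
```

With more ranks than GPUs, as in the 8 CPU / 4 GPU case, several ranks simply end up sharing a device.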

Thanks a lot for your help=)
Just one question: what should I do to compile this code? (Could you give me a sample program with the exact project properties in PVF?)

Thanks a lot for reading my topic))

Hi tereshin,

You can find information about using MPI within PVF in Chapter 4 of the PVF User’s Guide.

I don’t have an example of using CUDA Fortran with MPI in PVF. However, I’ve just started writing an article for the PGInsider newsletter about programming for multiple GPUs. I’ve mainly focused on OpenMP but will try to add an example using MPI as well.

- Mat

The function cudaGetDeviceCount() works, but “cudasetdevice()” fails to select the right device for me.

    info = cudasetdevice(2)

I checked the value of “info”, which is the return value of the cudasetdevice() call. The value is 36, which means runtime_error(36).

Do you have sample MPI + CUDA Fortran code that works?

Well, the first thing is to figure out what this error is. The “good” way to call many CUDA functions is:

    STATUS = cudaSetDevice(devicenum)
    if (STATUS /= 0) then 
       write (*,*) "cudaSetDevice failed: ", cudaGetErrorString(STATUS)
    end if

Using cudaGetErrorString, you’ll get the readable error.

My first question is: how many devices do you have? If you have only 2, their device numbers will be 0 and 1, a la MPI ranks, not 1 and 2. So if you try to set a device that doesn’t exist, as cudasetdevice(2) would on a two-GPU machine, I imagine you’ll get an error back.
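To rule that out before calling cudaSetDevice, a range check along these lines might help. This is only a sketch: `select_device` is a made-up helper name, and it assumes nothing beyond the `cudafor` module.

```fortran
subroutine select_device(devicenum)
   use cudafor
   implicit none
   integer, intent(in) :: devicenum
   integer :: istat, num_devices

   istat = cudaGetDeviceCount(num_devices)
   if (istat /= cudaSuccess) then
      write (*,*) 'cudaGetDeviceCount failed: ', &
                  trim(cudaGetErrorString(istat))
      return
   end if

   ! Valid device numbers run from 0 to num_devices-1
   if (devicenum < 0 .or. devicenum >= num_devices) then
      write (*,*) 'device ', devicenum, ' is out of range; valid: 0 to ', &
                  num_devices - 1
      return
   end if

   istat = cudaSetDevice(devicenum)
   if (istat /= cudaSuccess) then
      write (*,*) 'cudaSetDevice failed: ', trim(cudaGetErrorString(istat))
   end if
end subroutine select_device
```

Calling `select_device(2)` on a machine with two GPUs would then print the out-of-range message instead of reaching cudaSetDevice at all.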

Let us know what the Error String is for that error.

Matt