Distribution of Fortran code in every GPU core: Raw beginnings on GPGPU

Hello everyone,

My colleagues and I have a simulation program that requires a lot of processing power and must be run once for each unit of the whole simulation. A complete simulation has about a million units, i.e., the program runs about a million times.

Recently we thought about moving this project to the GPU world, beginning with CUDA. But as the program is really big, we were wondering whether (not as a final solution, of course) it is possible to send the code to the GPU as it is and have it executed on every core. The code is in Fortran, and each run takes about 2 hours on an average CPU.

That being said, I would like to know if anyone can point me to references on whether this is possible (or impossible).

I hope I explained everything right.

Your best option is probably PGI’s Accelerator Fortran, which brings Fortran code to the GPU through compiler directives inserted into the existing code.

PGI also offers a CUDA Fortran compiler which is analogous to Nvidia’s CUDA C compiler (and made Nvidia rename CUDA to CUDA C).

In principle you could also use a Fortran-to-C converter like f2c and then rewrite the result in CUDA C. But this is tedious, requires knowledge of Fortran, C, f2c, and CUDA, and generates ugly, hard-to-read code.

Thank you for your prompt response!

I had heard of the PGI compiler and I will learn more about it.

Nevertheless, I think I should rephrase my question as a more conceptual one: is it possible to take a program (written in C or Fortran), run it on every core of the GPU, and then collect its results, without having to alter it?

No. The minimum you have to do is to annotate the code with directives for the PGI Accelerator Compiler.

But that code will then still run on the CPU as well, so you do not need to maintain two separate versions for CPU and GPU.
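
To give an idea of what "annotating with directives" looks like, here is a minimal sketch in C. It uses OpenACC-style directives, the standardized descendant of the PGI Accelerator directives, so the exact spelling for your compiler version is an assumption you should check against its manual. The names simulate_unit and run_all_units are hypothetical stand-ins for your per-unit program and the loop over units; the arithmetic inside is a placeholder, not a real simulation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-unit simulation.
   The "routine" directive asks an OpenACC compiler to also build a
   device version; a compiler without OpenACC support ignores it. */
#pragma acc routine seq
static double simulate_unit(int unit)
{
    double x = (double)unit;
    return 0.5 * x + 1.0;   /* placeholder arithmetic only */
}

/* Runs every unit independently and stores one result per unit.
   With an OpenACC compiler the loop iterations may be spread across
   the GPU; without one, the pragma is ignored and this is an
   ordinary sequential CPU loop producing the same results. */
void run_all_units(double *result, int n)
{
    assert(result != NULL && n >= 0);

    #pragma acc parallel loop copyout(result[0:n])
    for (int i = 0; i < n; ++i) {
        result[i] = simulate_unit(i);
    }
}
```

Calling run_all_units(results, n) from your main program fills results[0] through results[n-1]. The point is that the source stays a valid CPU program: the directives are only hints, so one version of the code serves both targets. Whether a 2-hour routine maps well onto a single GPU thread is a separate question that this sketch does not answer.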