CUDA + FORTRAN ?

I have been using CUDA for more than three months now to accelerate various applications. I am now planning to apply CUDA to my mainstream research work (space trajectories and so on), but the problem is that NASA and the wider astrodynamics community prefer Fortran (for its compatibility, existing codes, familiarity and speed).

My professor would like me to work with CUDA, but a Fortran implementation is very important, especially in our scientific community, for my work to appeal to other researchers.

Could anyone (especially people at NVIDIA) give us a rough timeline for when a Fortran-compatible version of CUDA will be available? I know there has been talk of it coming, but when?

I have heard there are some wrapper packages that provide a Fortran-to-C-to-CUDA link, but how much performance do you lose, and how much harder are they to program?

Thanks, all.

Flagon, it is called, if memory serves me well.

Or you can always just write your performance-critical kernels in C, and link and call them from Fortran. There is no inherent overhead in this operation: for example, many BLAS libraries are now compiled C code that you can call from Fortran.

As for when CUDA/Fortran from NVIDIA is coming, who knows. It was on the milestone list at NVISION 2008, but I don’t recall where on the list it was, sorry.

Linking CUDA code with Fortran (CUDA being called from Fortran) is basically the same as calling C/C++ from Fortran.

You need to make sure the CUDA entry points (host side) are declared with C linkage and a matching calling convention (“__stdcall”, for instance, on Windows compilers that expect it) so that they can be used as C entry points.

Then you will need to make sure your entry point names follow the Fortran naming conventions (underscores, capitalization, …), which depend on your compiler tools.

Yup, exactly. Calling CUDA from Fortran is definitely the same thing as calling C from Fortran. Make sure that you properly declare the interfaces extern “C”, run “nm” on the object files just to make sure the symbol tables match (no idea what the equivalent of nm is in the Windows world), and things will work right out of the box.
Compiling your CUDA stuff into a static library is “cleaner”, but essentially the same thing. Same thing as in calling the GotoBLAS from Fortran for instance.
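To make the extern "C" and naming advice concrete, here is a minimal sketch of a Fortran-callable entry point. The routine name scale_add_ is hypothetical, and the lower-case name with a trailing underscore is an assumption that matches the default of gfortran and many Unix compilers; check yours with nm. A plain host loop stands in for the cudaMemcpy calls and kernel launch that a real .cu file would contain, so the sketch builds with any C++ compiler.

```cpp
// Sketch of a Fortran-callable entry point (names are hypothetical).
// In a real .cu file the loop below would be replaced by host-to-device
// copies, a kernel launch, and a copy back; a host loop stands in here.

// extern "C" switches off C++ name mangling, so the object file exports
// exactly the symbol "scale_add_". The trailing underscore is an
// ASSUMPTION about the Fortran compiler's naming convention.
extern "C" void scale_add_(const int *n, const float *a,
                           const float *x, float *y)
{
    // Fortran passes arguments by reference by default, hence the pointers.
    for (int i = 0; i < *n; ++i) {
        y[i] = (*a) * x[i] + y[i];   // y = a*x + y
    }
}
```

Under that naming assumption, the Fortran side would simply do "call scale_add(n, a, x, y)" with a default integer n and single-precision a, x and y.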

The other way round is the tricky part. Imagine a parallel Fortran environment, with some well-established error handling code in Fortran. You might want to call this code from CUDA in case CUDA wreaks havoc, e.g. error codes returned from launching kernels in the driver API, cudaGetLastError() in the runtime API, or some return code from a CUBLAS/CUFFT call. Fortran 2003 apparently defines nice interfaces; if, like me, you can’t program in F03 for some legacy reason, the result will be hacky. Some compilers (PGI, SunStudio, the NEC SX9 compiler) even emit symbol table entries for Fortran modules (akin to C++ “objects”) that can’t be called from C, because the compilers insert dots into the symbol names and dots are not legal in C identifiers, so C compilers choke on them. Some wrapping is needed here, code that in general is not fun to write.
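One way to sidestep calling Fortran from the C side altogether (an alternative, not what the post above describes) is to hand a status code back through an extra argument and let the existing Fortran error handling act on it. The sketch below assumes hypothetical names (gpu_step_, GPU_OK, GPU_BAD_SIZE) and uses a host loop as a placeholder; in a real .cu file the status would come from cudaGetLastError() or a CUBLAS/CUFFT return code.

```cpp
// Hypothetical status codes for illustration only.
enum { GPU_OK = 0, GPU_BAD_SIZE = 1 };

// Stand-in for the GPU work: returns a status instead of calling Fortran.
static int run_gpu_step(int n, float *data)
{
    if (n <= 0) {
        return GPU_BAD_SIZE;
    }
    for (int i = 0; i < n; ++i) {
        data[i] *= 2.0f;             // placeholder for a kernel launch
    }
    return GPU_OK;
}

// Fortran-callable wrapper. The last argument carries the status back,
// in the style of the "info" and "ierr" arguments of LAPACK and MPI.
// The trailing underscore is an ASSUMPTION about the Fortran compiler.
extern "C" void gpu_step_(const int *n, float *data, int *ierr)
{
    *ierr = run_gpu_step(*n, data);
}
```

The Fortran caller would then test ierr after "call gpu_step(n, data, ierr)" and invoke its own error routine, so no Fortran symbol ever has to be resolved from C.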

If you use Intel Fortran, C/C++ can access the Fortran modules directly.

It is very convenient to mix C and Fortran programming.

That’s a little bit deceptive, as one must transfer the data to and from the GPU within the function call. When choosing this approach, it’s really important that the performance gain outweighs the transfer overhead.
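A back-of-the-envelope check for this trade-off might look like the sketch below. The function name and the simple model (transfer time equals bytes moved divided by sustained bus bandwidth, with no overlap of copies and compute) are assumptions for illustration; real numbers should come from timing your own code.

```cpp
// Rough break-even test for offloading one call to the GPU.
// ASSUMPTIONS: transfers and kernel do not overlap, and the bus sustains
// a constant bandwidth; all inputs are measured or estimated by the user.
bool offload_pays_off(double cpu_seconds,
                      double kernel_seconds,
                      double bytes_moved,
                      double bus_bytes_per_second)
{
    // Time spent copying data to the GPU and back.
    double transfer_seconds = bytes_moved / bus_bytes_per_second;

    // Offloading wins only if kernel plus transfers beat the CPU version.
    return kernel_seconds + transfer_seconds < cpu_seconds;
}
```

For example, a routine that takes 1.0 s on the CPU and 0.1 s as a kernel still wins if it moves 4 GB over a 5 GB/s bus (0.8 s of transfers), but loses if it moves 5 GB.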

Of course. Such considerations apply to all CUDA applications.

When I made that statement I was only referring to the Fortran/C integration. The binding is done at the binary level using the same function call semantics, so there is no more overhead in calling a C function from Fortran than in calling a Fortran function from Fortran.