You can call CUDA code in a DLL from Fortran the same way you call any C API in a DLL from Fortran.
Here is a deliberately minimal example. I used the Microsoft C/C++ compiler and Intel Fortran on Windows here. For a real project, you would probably want to read up on how to use the ISO C bindings in Fortran (the iso_c_binding module and the bind(C) attribute).
We have three source files: cuda_dll.cu and cuda_dll.h for building the DLL, and the main program, written in Fortran in main.f90, which calls into the DLL.
cuda_dll.cu:
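#include <stdio.h>
#include <stdlib.h>
#include "cuda_dll.h"

// Device code: runs on the GPU and prints the value it was passed
__global__ void kernel (int n)
{
    printf ("kernel: n = %d\n", n);
}

// Host-side entry point exported from the DLL (declared extern "C" in cuda_dll.h).
// It takes a pointer because Fortran passes arguments by reference.
void wrapper (int *n)
{
    printf ("wrapper: calling kernel()\n");
    kernel<<<1,1>>>(*n);
    // Wait for the kernel to finish so that its printf output is flushed
    cudaDeviceSynchronize();
}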
cuda_dll.h:
#ifdef CUDADLL_EXPORTS
#define DLLEXPORT __declspec(dllexport)
#else
#define DLLEXPORT __declspec(dllimport)
#endif
extern "C" DLLEXPORT void wrapper (int *n);
main.f90:
program call_cuda
    implicit none
    write (*,*) "Calling wrapper from Fortran"
    ! Fortran passes arguments by reference, so the C side receives an int*
    call wrapper (5)
    stop
end program call_cuda
Build the DLL. On Windows this produces two files of interest: the actual dynamically linked library with a .dll extension, and an import library with a .lib extension that the main program links against:
nvcc --shared -DCUDADLL_EXPORTS -o cuda_dll.dll cuda_dll.cu
Now use the Intel Fortran compiler, ifort, to build the main program and link it against the import library. Note that we need to tell the Fortran compiler to generate lowercase symbol names, so that the name it emits for the call matches the symbol wrapper exported from the library:
ifort -names:lowercase -Fefortran_main.exe main.f90 cuda_dll.lib
Now run the resulting executable fortran_main.exe, which produces the following output:
Calling wrapper from Fortran
wrapper: calling kernel()
kernel: n = 5
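As for the ISO C bindings mentioned at the top, here is a minimal sketch of what main.f90 might look like with them. It assumes the same wrapper function exported from cuda_dll.dll; the program name call_cuda_bindc and the local variable n are made up for this illustration.

```fortran
program call_cuda_bindc
    use, intrinsic :: iso_c_binding, only: c_int
    implicit none

    ! Explicit interface for the C function exported from cuda_dll.dll.
    ! bind(C, name="wrapper") fixes the external symbol name exactly as written.
    interface
        subroutine wrapper (n) bind(C, name="wrapper")
            import :: c_int
            ! No VALUE attribute, so n is passed by reference and matches "int *n"
            integer(c_int), intent(in) :: n
        end subroutine wrapper
    end interface

    integer(c_int) :: n

    n = 5_c_int
    write (*,*) "Calling wrapper from Fortran"
    call wrapper (n)
end program call_cuda_bindc
```

Because the interface spells out the symbol name and the argument type, the compiler can check the call, and the -names:lowercase option should no longer be needed when building this version.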