Device functions in separate modules

Hello,
I have an in-house CFD code written in Fortran and parallelized with OpenMP. I would like to parallelize this code using PGI CUDA Fortran, but the greatest disadvantage of CUF that I have found is that all device functions that call each other must be contained in a single module, which would affect the structure of my code. Therefore, my question is:
Will CUDA Fortran support device functions placed in separate modules and/or nested modules?

Hi Folo,

Will CUDA Fortran support device functions placed in separate modules and/or nested modules?

The problem is that NVIDIA does not currently have a linker for device code. Hence, there is no way to associate symbols in one module with symbols found in another. If and when NVIDIA adds a linker, we should be able to have CUDA Fortran call external device subroutines.

Note that your host code can call global subroutines (kernels) from multiple modules. Hence, the various device modules can share device data, provided that the host code passes the device pointers from one routine to another.
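As a rough illustration of that workaround, here is a minimal sketch in which the host launches kernels from two separate modules on the same device array. The module names (scale_mod, shift_mod), the kernel names, and the array size are hypothetical and chosen only for this example. Neither kernel calls a device routine from the other module, so no device-side linking is needed.

```fortran
! Hypothetical example: module, kernel, and variable names are illustrative only.
module scale_mod
  use cudafor
  implicit none
contains
  ! Kernel in the first module: multiplies each element by a factor.
  attributes(global) subroutine scale_kernel(a, n, factor)
    integer, value :: n
    real, value :: factor
    real, device :: a(n)
    integer :: i
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (i <= n) a(i) = a(i) * factor
  end subroutine scale_kernel
end module scale_mod

module shift_mod
  use cudafor
  implicit none
contains
  ! Kernel in the second module: adds an offset to each element.
  attributes(global) subroutine shift_kernel(a, n, offset)
    integer, value :: n
    real, value :: offset
    real, device :: a(n)
    integer :: i
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (i <= n) a(i) = a(i) + offset
  end subroutine shift_kernel
end module shift_mod

program two_modules
  use cudafor
  use scale_mod
  use shift_mod
  implicit none
  integer, parameter :: n = 1024, tpb = 256   ! tpb = threads per block
  real :: a_h(n)                              ! host copy
  real, device :: a_d(n)                      ! device array owned by the host program

  a_h = 1.0
  a_d = a_h                                   ! host-to-device copy
  ! The host passes the same device array to kernels from both modules.
  call scale_kernel<<<(n + tpb - 1) / tpb, tpb>>>(a_d, n, 2.0)
  call shift_kernel<<<(n + tpb - 1) / tpb, tpb>>>(a_d, n, 3.0)
  a_h = a_d                                   ! device-to-host copy
  print *, 'max deviation from 1.0*2.0 + 3.0: ', maxval(abs(a_h - 5.0))
end program two_modules
```

With the PGI compiler, a file like this would typically be given a .cuf extension or built with the -Mcuda flag. The data stays on the device between the two launches; only the host program needs to see both modules.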

  • Mat