Hi, after a discussion in the fortran-lang discourse group regarding function inlining on GPUs, I tried to see whether it would be possible to inline a C++ function within a Fortran program that will be executed on the GPU. In that thread I show how I took the saxpy example and modified it to call a C++ implementation, then compared the execution times.
I found that execution was indeed much slower than with the pure Fortran implementation. However, as also mentioned there, the NVIDIA HPC manuals on inlining do not state any explicit limitation on inlining in a cross-language application. Is it possible to do? And if so, could you give some hints on how to go about it?
The slowdown is because the C code isn’t getting vectorized due to potential pointer aliasing. Either add the “restrict” keyword to the pointer arguments or compile with the flag “-Msafeptr” to assert to the compiler that there is no aliasing.
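As a rough illustration of the “restrict” suggestion, here is a minimal sketch of a saxpy routine in C. The function name saxpy_c and its argument list are assumptions made for this example, not the code from the original post. The restrict qualifiers are a promise from the programmer that x and y never overlap, which removes the aliasing concern that can block vectorization.

```c
#include <stddef.h>

/* Hypothetical saxpy kernel (name and signature are assumptions).
 * "restrict" (C99) asserts that x and y do not alias each other,
 * so the compiler is free to vectorize the loop. */
void saxpy_c(int n, float a, const float *restrict x, float *restrict y)
{
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];   /* y <- a*x + y, element by element */
    }
}
```

If the source is compiled as C++ rather than C, the standard restrict keyword is not available; compilers commonly accept the __restrict__ extension instead. On the Fortran side, calling a routine like this would need an interface declared with bind(C), with the scalar arguments given the value attribute.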
Is it possible to do?
Cross-language calling is certainly fine, but cross-language inlining isn’t something we support.