Calling a GPU kernel subroutine from a global subroutine

Hi group,

How can I call a subroutine (GPU kernel) from a global (GPU kernel) subroutine? What do I need to be aware of?

I remember trying this a while back, but the compiler complained that it was an illegal operation. I might have missed something.



Hi Dolf,

Please see sections 2.5.3 and 2.5.4 of the CUDA Fortran Users Guide.

As for the illegal operation, I’m not sure. If you encounter it again, post an example and we can determine the issue.

  • Mat

Hi Mat,

Thanks for the reply. I have read the user manual, and there are some restrictions on using such techniques. Can you post an example so I can understand them better?
Say we have a global subroutine (called from a host subroutine using chevrons) that contains a line calling a device subroutine. Could you show us which restrictions need to be observed?


Hi Dolf,

Other than what’s in the user’s guide, I don’t have a list of restrictions. Apart from the normal device code restrictions, there is not much difference between what can be done in a “global” routine and what can be done in a “device” routine. Is there a specific issue that you’re encountering?

As for examples, we have many general CUDA Fortran examples in “$PGI/pgi/linux86-64/2015/examples/CUDA-Fortran/”. However, it appears that we don’t have much for “device” routines except for the trivial example in “SDK/sgemm/sgemm.cuf”.
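
To illustrate the pattern being asked about, here is a minimal sketch of a “global” routine, launched from the host with chevrons, that calls a “device” routine. All names (kernels_m, scale_elem, scale_kernel) are hypothetical and are not taken from the PGI examples. The sketch assumes both routines live in the same module, which is the simplest arrangement, and that the file is built with something like "pgfortran -Mcuda example.cuf".

```fortran
module kernels_m
  use cudafor
  implicit none
contains

  ! Device routine (hypothetical): callable only from device code,
  ! i.e. from a global routine or from another device routine.
  attributes(device) subroutine scale_elem(x, factor)
    real, intent(inout) :: x
    real, value :: factor
    x = x * factor
  end subroutine scale_elem

  ! Global routine (kernel): launched from the host with chevrons.
  attributes(global) subroutine scale_kernel(a, n, factor)
    real :: a(*)              ! array dummies in a kernel reside in device memory
    integer, value :: n       ! scalars are passed by value
    real, value :: factor
    integer :: i

    ! One thread per element; CUDA Fortran indices are 1-based.
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (i <= n) call scale_elem(a(i), factor)
  end subroutine scale_kernel

end module kernels_m

program call_device_from_global
  use cudafor
  use kernels_m
  implicit none
  integer, parameter :: n = 1024
  real :: a(n)
  real, device :: a_d(n)
  integer :: istat

  a = 1.0
  a_d = a                                   ! host-to-device copy

  ! Launch enough 256-thread blocks to cover all n elements.
  call scale_kernel<<<(n + 255) / 256, 256>>>(a_d, n, 2.0)
  istat = cudaDeviceSynchronize()

  a = a_d                                   ! device-to-host copy
  print *, 'max error: ', maxval(abs(a - 2.0))
end program call_device_from_global
```

The device routine is invoked with an ordinary call statement; only the host-side launch of the global routine uses the chevron syntax.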

  • Mat

I am using the Windows version of PVF 15.3, and I can’t see the folder location you are referring to.


Try: C:\Program Files\PGI\win64\2015\examples\CUDA-Fortran

Nothing there.