I’m a little confused about how to call a subroutine from another subroutine that is already running on the GPU. I’m porting quite a large piece of code using CUDA Fortran. As with most large codes, the subroutine I am porting makes lots of calls to other subroutines.
The PGI programming guide seems to contradict itself. It says:
A subroutine or function with the device attribute must appear within a Fortran
module, and may only be called from device subprograms in the same module.
but also
A kernel subroutine may not be contained in another subroutine or
function, and may not contain any other subprogram.
So, I’m a little confused as to what I need to do with my subroutines… do they need inlining or can I just stick them all in one module?
If anyone could clarify this for me it would be much appreciated.
Crip_crop
Hi Crip_crop,
The second quote means that a kernel cannot be contained within another subroutine, but it can be contained within a module. The first quote means that when device code calls another device routine, the two routines must be in the same module. Hence, your module would look something like this:
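module mymod
  implicit none
contains
  ! Device function: callable only from device code in this same module
  attributes(device) function func1(a, b) result(r)
    real, intent(in) :: a, b
    real :: r
    ! some code that sets r
  end function func1

  ! Kernel: launched from the host; it can call func1 because both are in mymod
  attributes(global) subroutine sub1()
    real :: i, x, y
    ! some code
    i = func1(x, y)
    ! more code
  end subroutine sub1
end module mymod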
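For a fuller picture, here is a minimal sketch of a complete program laid out the same way. The names saxpy_mod, scale_add and saxpy, the array size, and the launch configuration are all made up for illustration; treat it as one possible layout under those assumptions, not the only way to do it.

```fortran
! Hypothetical example: a device function and the kernel that calls it
! live in the same module, as the programming guide requires.
module saxpy_mod
  implicit none
contains
  ! Device function: runs on the GPU and is called from the kernel below.
  attributes(device) function scale_add(a, x, y) result(r)
    real, intent(in) :: a, x, y
    real :: r
    r = a * x + y
  end function scale_add

  ! Kernel: one thread per array element.
  attributes(global) subroutine saxpy(n, a, x, y)
    integer, value :: n
    real, value :: a
    real :: x(n), y(n)
    integer :: i
    ! Global 1-based index of this thread
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (i <= n) y(i) = scale_add(a, x(i), y(i))
  end subroutine saxpy
end module saxpy_mod

program main
  use cudafor
  use saxpy_mod
  implicit none
  integer, parameter :: n = 1024
  real :: x(n), y(n)
  real, device :: x_d(n), y_d(n)

  x = 1.0
  y = 2.0
  ! Host-to-device copies by assignment
  x_d = x
  y_d = y
  ! Launch enough 256-thread blocks to cover all n elements
  call saxpy<<<(n + 255) / 256, 256>>>(n, 2.0, x_d, y_d)
  ! Device-to-host copy; this waits for the kernel to finish
  y = y_d
  print *, 'max error: ', maxval(abs(y - 4.0))
end program main
```

The host program only needs to use the module; everything that runs on the device stays inside it.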
Hope this helps,
Mat
That’s really useful, thanks. Wowzer this is going to be one meaty module.
Does anyone happen to know the reason why all device subroutines are required to be in the same module?
I’m doing a presentation about my experiences using CUDA Fortran and I’d like to give an explanation for this.
Cheers,
Crip_crop
Hi Crip_crop,
Does anyone happen to know the reason why all device subroutines are required to be in the same module?
To ensure they can be inlined.
In CUDA C and CUDA Fortran, true subroutine calls from device code are not currently supported. You can still organize your code into separate routines, but at compile time the compiler must inline them into the kernel that calls them.
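As a rough source-level picture of what inlining means here, the two kernels in the sketch below should compute the same thing. The names inline_demo, square, kernel_with_call and kernel_inlined are hypothetical, and the second kernel is written by hand purely for illustration; what the compiler actually generates will look different.

```fortran
! Hypothetical illustration of inlining a device function into a kernel.
module inline_demo
  implicit none
contains
  ! Small device function kept separate for readability.
  attributes(device) function square(v) result(r)
    real, intent(in) :: v
    real :: r
    r = v * v
  end function square

  ! Kernel as the programmer writes it, with a call to square.
  attributes(global) subroutine kernel_with_call(n, a)
    integer, value :: n
    real :: a(n)
    integer :: i
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (i <= n) a(i) = square(a(i))
  end subroutine kernel_with_call

  ! Hand-inlined equivalent: the body of square replaces the call.
  attributes(global) subroutine kernel_inlined(n, a)
    integer, value :: n
    real :: a(n)
    integer :: i
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (i <= n) a(i) = a(i) * a(i)
  end subroutine kernel_inlined
end module inline_demo
```

To do this substitution the compiler needs the body of the called routine available when it compiles the kernel, which is why keeping both in one module helps.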
Hope this helps,
Mat