Calling functions within device code

crip_crop1 · November 18, 2010, 12:35pm

I’m a little bit confused about how to call a subroutine from a subroutine already on the GPU. I’m porting quite a large bit of code using CUDA Fortran. As with most large codes, there are lots of subroutine calls within the subroutine I am porting.

The PGI programming guide seems to contradict itself. It says:

A subroutine or function with the device attribute must appear within a Fortran
module, and may only be called from device subprograms in the same module.

but also

A kernel subroutine may not be contained in another subroutine or
function, and may not contain any other subprogram.

So, I’m a little confused as to what I need to do with my subroutines… do they need inlining or can I just stick them all in one module?

If anyone could clarify this for me it would be much appreciated.

Crip_crop

MatColgrove · November 18, 2010, 10:11pm

Hi Crip_crop,

The second quote means that a kernel cannot be contained within another subroutine, but it can be contained within a module. The first quote means that when device code calls another device routine, the two routines must be in the same module. Hence, your module would look something like this:

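module mymod

contains

  ! Device function: callable only from device code in this module
  attributes(device) function func1(a,b)
    ! some code
  end function func1

  ! Kernel: launched from the host; may call func1
  attributes(global) subroutine sub1()
    ! some code
    i = func1(x,y)
    ! more code
  end subroutine sub1

end module mymod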
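
For a fuller picture, here is a minimal sketch of the same layout with the bodies filled in. It is only an illustration, not taken from the PGI guide: the names saxpy_mod, scale_add, saxpy_kernel and run_saxpy are made up, and the block size of 256 is an arbitrary choice. The device function and the kernel that calls it sit in one module, alongside a host wrapper that launches the kernel.

```fortran
! Illustrative sketch only: all names here are hypothetical.
module saxpy_mod
  use cudafor
  implicit none
contains

  ! Device function: callable only from device code, here in the same module.
  attributes(device) function scale_add(a, x, y) result(r)
    real, value :: a, x, y
    real :: r
    r = a * x + y
  end function scale_add

  ! Kernel: launched from the host; each thread handles one element
  ! and calls the device function above.
  attributes(global) subroutine saxpy_kernel(n, a, x, y)
    integer, value :: n
    real, value :: a
    real :: x(n), y(n)
    integer :: i
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (i <= n) y(i) = scale_add(a, x(i), y(i))
  end subroutine saxpy_kernel

  ! Host wrapper: copies data to the GPU, launches the kernel, copies back.
  subroutine run_saxpy(n, a, x, y)
    integer, intent(in) :: n
    real, intent(in) :: a, x(n)
    real, intent(inout) :: y(n)
    real, device, allocatable :: x_d(:), y_d(:)
    integer :: nblocks
    allocate(x_d(n), y_d(n))
    x_d = x                        ! host-to-device copy
    y_d = y
    nblocks = (n + 255) / 256      ! enough 256-thread blocks to cover n
    call saxpy_kernel<<<nblocks, 256>>>(n, a, x_d, y_d)
    y = y_d                        ! device-to-host copy
    deallocate(x_d, y_d)
  end subroutine run_saxpy

end module saxpy_mod
```

A host program would then just `use saxpy_mod` and `call run_saxpy(n, a, x, y)`. The file needs to be compiled as CUDA Fortran, for example by giving it a .cuf extension or passing -Mcuda to pgfortran.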

Hope this helps,
Mat

crip_crop1 · November 19, 2010, 2:11pm

That’s really useful, thanks. Wowzer this is going to be one meaty module.

crip_crop1 · September 27, 2011, 4:16pm

Does anyone happen to know the reason why all device subroutines are required to be in the same module?

I’m doing a presentation about my experiences using CUDA Fortran and I’d like to give an explanation for this.

Cheers,
Crip_crop

MatColgrove · September 27, 2011, 8:11pm

Hi Crip_crop,

"Does anyone happen to know the reason why all device subroutines are required to be in the same module?"

To ensure they can be inlined.

In CUDA C and CUDA Fortran, true subroutine calls from device code are not supported. While you can organize your code into separate routines, the compiler must inline these routines at compile time, and keeping them in the same module as the caller makes that possible.

Hope this helps,
Mat
