undefined references for global functions

I’m trying to compile the very basic examples from the pgifortran documentation:

! Kernel definition
attributes(global) subroutine ksaxpy( n, a, x, y )
real, dimension(*) :: x,y
real, value :: a
integer, value :: n, i
i = (blockidx%x-1) * blockdim%x + threadidx%x
if( i <= n ) y(i) = a * x(i) + y(i)
end subroutine

! Host subroutine
subroutine solve( n, a, x, y )
real, device, dimension(*) :: x, y
real :: a
integer :: n
! call the kernel
call ksaxpy<<<n/64, 64>>>( n, a, x, y )
end subroutine

But that only gives me the error message:

undefined reference to `ksaxpy_'

I have no idea how to get rid of it. The same thing happens when I write my own GPU subroutines. Calls to host functions work just fine, but as soon as I add “attributes(global)” to any subroutine, the undefined reference error pops up.

Hi Pschmidt,

To call a kernel from host code, the global subroutine needs an explicit interface. You can either write an interface block yourself or place the subroutine in a module, where the interface is generated automatically. The example given isn’t meant to be compiled directly. However, since you’ve had trouble with it, I’m sure others will as well, so I’ll ask that the example be put into a ‘compilable’ form.

Thanks,
Mat

module saxpy

contains

! Kernel definition
attributes(global) subroutine ksaxpy( n, a, x, y )
real, dimension(*) :: x,y
real, value :: a
integer, value :: n
integer :: i   ! local thread index; VALUE applies only to dummy arguments
i = (blockidx%x-1) * blockdim%x + threadidx%x
if( i <= n ) y(i) = a * x(i) + y(i)
end subroutine

! Host subroutine
subroutine solve( n, a, x, y )
real, device, dimension(*) :: x, y
real :: a
integer :: n
! call the kernel; round the block count up so a partial last block is still launched
call ksaxpy<<<(n+63)/64, 64>>>( n, a, x, y )
end subroutine

end module saxpy

program foo
   use saxpy
   ... 
end program foo
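
For anyone who wants something to run end to end, here is a minimal sketch of what a driver could look like. It is not from the documentation: the program name saxpy_driver, the array size, and the test values are made up for illustration, and it assumes the saxpy module above is compiled together with it with CUDA Fortran enabled (for example a .cuf file, or the -Mcuda flag).

```fortran
program saxpy_driver          ! hypothetical name, for illustration only
   use saxpy                  ! the module shown above
   implicit none
   integer, parameter :: n = 1024     ! arbitrary size, a multiple of 64
   real :: x(n), y(n), a
   real, device, allocatable :: x_d(:), y_d(:)   ! device copies of x and y

   a = 2.0
   x = 1.0
   y = 3.0

   allocate( x_d(n), y_d(n) )
   x_d = x                    ! assignment copies host data to the device
   y_d = y

   call solve( n, a, x_d, y_d )

   y = y_d                    ! assignment copies the result back to the host
   print *, 'largest deviation from a*x+y: ', maxval( abs( y - (a*1.0 + 3.0) ) )

   deallocate( x_d, y_d )
end program saxpy_driver
```

Because solve declares x and y with the device attribute, the driver has to pass device arrays; the host copies stay on the host side of the assignments.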

Thank you very much, that works.
Still, I don’t fully understand why this is necessary. Is there a reason why GPU functions have to be in a module? What would other practical implementations look like? It just doesn’t seem very intuitive to me, and I’m wondering if there is some reasoning behind it that I don’t understand yet.

Is there a reason why gpu functions have to be in a module?

They don’t need to be in a module. However, they do need an explicit interface. Modules provide that interface automatically, which saves you the work of writing interface blocks in your host code.
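
As a rough sketch of the non-module route, the host subroutine could declare the kernel’s interface itself. This assumes ksaxpy is kept as an external subroutine outside any module and is compiled and linked with CUDA Fortran enabled; the interface body simply mirrors the kernel’s dummy arguments and has to be kept in sync with them by hand.

```fortran
! Host subroutine with a hand-written interface for the external kernel
subroutine solve( n, a, x, y )
   implicit none
   real, device, dimension(*) :: x, y
   real :: a
   integer :: n

   ! Explicit interface: tells the host compiler that ksaxpy is a
   ! global (kernel) subroutine and how its arguments are passed
   interface
      attributes(global) subroutine ksaxpy( n, a, x, y )
         real, dimension(*) :: x, y
         real, value :: a
         integer, value :: n
      end subroutine ksaxpy
   end interface

   ! call the kernel, rounding the block count up
   call ksaxpy<<<(n+63)/64, 64>>>( n, a, x, y )
end subroutine solve
```

Every host routine that launches the kernel would need the same interface block, which is the repetition a module avoids.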

I also recommend that users use modules, since module data can be accessed by any routine in the module. Modules help with type checking, data encapsulation, and code reuse.

What would other practical implementations look like?

This might not be exactly what you want, but this tutorial I wrote gives an example implementation: http://www.pgroup.com/lit/articles/insider/v2n1a4.htm


Hope this helps,
Mat