Calling a device subroutine from a global subroutine.


How can I call a device subroutine from a global subroutine? What do I need to take into account?

Please provide a simple example.


Hi Dolf,

You just need to decorate the subroutine with “attributes(device)”, then call it like any other routine. The only difference is that it can be called only from device code.

I’d avoid using automatic arrays in device routines. They’re supported, but they require every thread to allocate memory, which can hurt performance.
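As a rough sketch of the difference (the module name, routine names, and the TILE size below are made up for illustration and are not taken from the SDK), the first routine uses an automatic array whose size depends on an argument, while the second uses a local array whose size is known at compile time:

```fortran
module local_array_mod
  implicit none
  integer, parameter :: TILE = 16   ! hypothetical compile-time size
contains

  ! Automatic array: tmp is sized by the dummy argument n, so each
  ! thread has to allocate it when the routine is entered.
  attributes(device) subroutine scaled_sum_auto(n, x, s)
    integer, intent(in) :: n
    real, intent(in)    :: x(n)
    real, intent(out)   :: s
    real    :: tmp(n)               ! automatic array
    integer :: i
    s = 0.0
    do i = 1, n
      tmp(i) = 2.0 * x(i)
      s = s + tmp(i)
    end do
  end subroutine scaled_sum_auto

  ! Fixed-size array: the size is a compile-time constant, so no
  ! per-thread run-time allocation is needed.
  attributes(device) subroutine scaled_sum_fixed(x, s)
    real, intent(in)  :: x(TILE)
    real, intent(out) :: s
    real    :: tmp(TILE)            ! fixed-size local array
    integer :: i
    s = 0.0
    do i = 1, TILE
      tmp(i) = 2.0 * x(i)
      s = s + tmp(i)
    end do
  end subroutine scaled_sum_fixed

end module local_array_mod
```

Both routines compute the same thing when n equals TILE; the second form is the one I'd prefer where the size can be fixed up front.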

You can’t use the “SAVE” attribute or use data initialization since this causes the variable to have host global storage, which isn’t available on the device. Also, you can’t have optional arguments and the device subroutine can’t be contained in a host routine nor can it contain routines.

While not absolutely required in all cases, you should have interfaces to the device routines, either by writing explicit interface blocks or implicitly by putting the routine in a module.
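Here is a minimal sketch of the whole pattern, with the device routine and the kernel in the same module so the interface comes for free. The names scale_mod, scale_add, and saxpy_kernel, the problem size, and the launch configuration are all hypothetical choices for illustration:

```fortran
module scale_mod
  use cudafor
  implicit none
contains

  ! Device routine: callable only from device code
  ! (kernels or other device routines).
  attributes(device) subroutine scale_add(a, x, y)
    real, intent(in)    :: a, x
    real, intent(inout) :: y
    y = y + a * x
  end subroutine scale_add

  ! Kernel: launched from the host; each thread calls the
  ! device routine for one element.
  attributes(global) subroutine saxpy_kernel(n, a, x, y)
    integer, value :: n
    real, value    :: a
    real, device   :: x(n), y(n)
    integer :: i
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (i <= n) call scale_add(a, x(i), y(i))
  end subroutine saxpy_kernel

end module scale_mod

program main
  use cudafor
  use scale_mod
  implicit none
  integer, parameter :: n = 1024
  real         :: x(n), y(n)
  real, device :: x_d(n), y_d(n)
  integer      :: istat

  x = 1.0
  y = 2.0
  x_d = x                     ! host-to-device copies
  y_d = y

  call saxpy_kernel<<<(n + 255) / 256, 256>>>(n, 3.0, x_d, y_d)
  istat = cudaDeviceSynchronize()

  y = y_d                     ! device-to-host copy
  print *, 'max error:', maxval(abs(y - 5.0))
end program main
```

The file needs to be compiled as CUDA Fortran, for example by giving it a .cuf extension or by passing -Mcuda.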

By default, you can call device routines found in external modules. However, if you compile without RDC (-Mcuda=nordc), you can only call device routines found in the same module as the caller.

There’s a simple example of using a device routine in the “sgemm” SDK example that ships with the compilers.

Hope this helps,

MODULE saxpy_sgemm
CONTAINS

  attributes(device) subroutine saxpy16(a, b, c)
    real, device :: a
    real, dimension(16) :: b
    real, device, dimension(16) :: c
    c = c + a * b
  end subroutine

  attributes(global) subroutine sgemmNN_16x16(a, b, c, m, n, k, alpha, beta)
    real, device :: a(m,*), b(k,*), c(m,*)
    integer, value :: m, n, k
    real, value :: alpha, beta

    real, shared, dimension(17,16) :: bs
    real, device, dimension(16) :: cloc

    inx = threadidx%x
 .... cut ...
      do j = 1, 16
        call saxpy16(a(ia,ik+j-1), bs(1,j), cloc)
      end do
... cut ...