Why is a module necessary for GPGPU code?

I notice that for all GPGPU code, whether it is CUDA Fortran or OpenACC, the subroutines and functions have to be put into a module for the code to compile. Why is that?

Chee Choung

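That is not strictly true. A module is not required, but it is the most convenient option, because the interfaces of device and global procedures need to be explicit and a module provides them automatically. Without a module, you have to write the interface blocks yourself: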

! Device function, defined outside any module
attributes(device) function iii(i)
  integer, device :: i
  iii = i - 1
end function

! Kernel: needs an explicit interface for the device function it calls
attributes(global) subroutine jjj(x)
  interface
    attributes(device) function iii(i)
      integer, device :: i
    end function
  end interface
  integer, device :: x(*)
  i = threadIdx%x
  x(i) = iii(i)
end subroutine

! Host program: needs an explicit interface for the kernel it launches
program main
  use cudafor
  interface
    attributes(global) subroutine jjj(x)
      integer, device :: x(*)
    end subroutine
  end interface
  integer, managed :: x(128)
  call jjj<<<1,128>>>(x)
  istat = cudaDeviceSynchronize()
  ! x(i) = i - 1, so both values should be 0 + 1 + ... + 127
  print *, sum(x), (127*128)/2
end
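
For comparison, here is a sketch of the same program with both procedures placed in a module, which is the convenient route mentioned above. The module name `kernels` is made up for illustration, and I have added `implicit none` with explicit declarations and passed `i` by value, so treat it as one possible layout rather than the required form. Because module procedures get explicit interfaces automatically, none of the hand-written interface blocks are needed.

```fortran
module kernels            ! hypothetical module name, for illustration only
  use cudafor
  implicit none
contains
  ! Device function: its interface is explicit to everything in this module
  attributes(device) function iii(i)
    integer :: iii
    integer, value :: i   ! assumption: pass the scalar by value
    iii = i - 1
  end function

  ! Kernel: can call iii directly, no interface block required
  attributes(global) subroutine jjj(x)
    integer, device :: x(*)
    integer :: i
    i = threadIdx%x
    x(i) = iii(i)
  end subroutine
end module kernels

program main
  use cudafor
  use kernels             ! brings in the explicit interface for jjj
  implicit none
  integer, managed :: x(128)
  integer :: istat
  call jjj<<<1,128>>>(x)
  istat = cudaDeviceSynchronize()
  ! x(i) = i - 1, so both values should be 0 + 1 + ... + 127
  print *, sum(x), (127*128)/2
end program main
```

The two printed values should match if the kernel ran correctly. The trade-off is just bookkeeping: with the module, any change to an argument list is picked up everywhere automatically, whereas the interface blocks in the module-free version have to be kept in sync by hand.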