Trying to use device array as a derived type member in a kernel

physicswjy · November 15, 2020, 4:48pm

Hi. I’m new to CUDA Fortran and am trying to compile the following code with the nvfortran compiler:

module defvars
use cudafor
implicit none

integer, parameter :: n = 100

type Type1
    integer, allocatable :: a(:)
    integer, allocatable, device :: a_dev(:)
end type Type1

type(Type1) :: T1

end module defvars


module calc
use cudafor
use defvars
implicit none

contains
attributes(global)  subroutine kernel()

    integer :: i

    i = threadIdx%x
    T1%a_dev(i) = i

end subroutine kernel

end module calc


program main
use defvars
use calc
use cudafor
implicit none

allocate(T1%a(n))
allocate(T1%a_dev(n))

T1%a = 0
T1%a_dev = T1%a

call kernel<<<1,n>>>()

T1%a = T1%a_dev

print *, T1%a

end program main

The compiler reports the following error: “NVFORTRAN-S-0520-Host MODULE data cannot be used in a DEVICE or GLOBAL subprogram”.

I’m trying to copy an array in a host derived type to the device and then use that device array in a kernel subroutine. I understand that this can be done by passing the device array and its length as kernel arguments. However, my real application uses many arrays in a single kernel, so I want to wrap the array(s) in a derived type and use them without listing each one in the kernel’s argument list, to keep my code simple and organized. Can this be achieved?

Thanks!
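
For reference, the argument-passing approach mentioned above could look like the sketch below. It keeps the derived type from the question on the host and passes only the device member to the kernel. The names calc_args and kernel_args are hypothetical, and the bounds check is an added safety measure, not part of the original code.

```fortran
module defvars
use cudafor
implicit none

integer, parameter :: n = 100

type Type1
    integer, allocatable :: a(:)
    integer, allocatable, device :: a_dev(:)
end type Type1

type(Type1) :: T1          ! host object; only its a_dev member lives on the device

end module defvars

module calc_args           ! hypothetical module name
use cudafor
implicit none

contains
attributes(global) subroutine kernel_args(a_dev, len)
    integer, value :: len  ! scalar passed by value
    integer :: a_dev(len)  ! array dummy arguments of a kernel are device arrays
    integer :: i

    i = threadIdx%x
    if (i <= len) a_dev(i) = i   ! guard against out-of-range threads

end subroutine kernel_args
end module calc_args

program main
use cudafor
use defvars
use calc_args
implicit none

allocate(T1%a(n))
allocate(T1%a_dev(n))

T1%a = 0
T1%a_dev = T1%a                          ! host-to-device copy

call kernel_args<<<1,n>>>(T1%a_dev, n)   ! the kernel never touches T1 itself

T1%a = T1%a_dev                          ! device-to-host copy
print *, T1%a

end program main
```

The kernel only sees the array it is given, so the host-resident T1 is never referenced in device code. The cost is the one noted in the question: every array has to appear in the argument list.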

bleback · November 16, 2020, 5:25pm

What is the benefit of putting them in a derived type? Can they just be declared in the module scope outside of a derived type? Inside the kernel, the compiler has to get to a_dev through T1, on the device, but T1 itself is a host object, so that is difficult to set up from the host.

module defvars
use cudafor
implicit none

integer, parameter :: n = 100

integer, allocatable :: a(:)
integer, allocatable, device :: a_dev(:)

end module defvars

If you must use a derived type, you can perhaps experiment with making the top-level derived type managed.
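
As a rough sketch of the module-scope suggestion, the original program could be adapted as follows. This assumes the default nvfortran settings, under which a kernel in one module can reference device data declared in another module. The bounds check in the kernel is an addition for safety.

```fortran
module defvars
use cudafor
implicit none

integer, parameter :: n = 100

integer, allocatable :: a(:)              ! host copy
integer, allocatable, device :: a_dev(:)  ! device copy, visible to kernels

end module defvars

module calc
use cudafor
use defvars
implicit none

contains
attributes(global) subroutine kernel()

    integer :: i

    i = threadIdx%x
    if (i <= n) a_dev(i) = i   ! module-scope device array, no argument needed

end subroutine kernel

end module calc

program main
use cudafor
use defvars
use calc
implicit none

allocate(a(n))
allocate(a_dev(n))

a = 0
a_dev = a          ! host-to-device copy

call kernel<<<1,n>>>()

a = a_dev          ! device-to-host copy
print *, a

end program main
```

Here the kernel references only a_dev and the compile-time constant n, so no host module data is used in device code.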

physicswjy · November 17, 2020, 12:26am

Thank you for your reply. I understand now that it is difficult to put device arrays in a host derived type, so I will simply put them outside.

MatColgrove · November 17, 2020, 4:08pm

You might want to consider using OpenACC for this case. For example:

% cat test.f90
module defvars
implicit none
integer, parameter :: n = 100
type Type1
    integer, allocatable :: a(:)
end type Type1
type(Type1) :: T1

end module defvars

module calc
use defvars
implicit none

contains
subroutine kernel()

    integer :: i
!$acc parallel loop present(T1)
    do i=1,n
       T1%a(i) = i
    enddo

end subroutine kernel
end module calc

program main
use defvars
use calc
implicit none

allocate(T1%a(n))
T1%a = 0
!$acc enter data copyin(T1,T1%a(1:n))
call kernel()
!$acc update self(T1%a(1:n))
print *, T1%a
!$acc exit data delete(T1%a,T1)

end program main
% nvfortran -acc -Minfo=accel test.f90 -V20.9; a.out
kernel:
     19, Generating present(t1)
         Generating Tesla code
         20, !$acc loop gang, vector(100) ! blockidx%x threadidx%x
main:
     34, Generating enter data copyin(t1,t1%a(1:100))
     36, Generating update self(t1%a(1:100))
     38, Generating exit data delete(t1%a(:),t1)
            1            2            3            4            5            6
            7            8            9           10           11           12
           13           14           15           16           17           18
           19           20           21           22           23           24
           25           26           27           28           29           30
           31           32           33           34           35           36
           37           38           39           40           41           42
           43           44           45           46           47           48
           49           50           51           52           53           54
           55           56           57           58           59           60
           61           62           63           64           65           66
           67           68           69           70           71           72
           73           74           75           76           77           78
           79           80           81           82           83           84
           85           86           87           88           89           90
           91           92           93           94           95           96
           97           98           99          100

physicswjy · November 18, 2020, 12:59pm

Thanks Mat, I’ll give it a try.