Hello, in the following code I use a derived type with allocatable members. If I do not use declare create on it, there are 3 methods that give the correct answer:
Method 1: use an unstructured data region, acc enter data and acc exit data.
Method 2: use a structured data region, acc data and acc end data.
Method 3: use no data directives. The compiler will implicitly copy in what we need.
My first question is: which method is best, and why?
My second question is: how should private pointer variables be used in the kernel? As shown in my code, the results are the same whether or not I use private(p_t, p_qv, p_qc). It also looks like the race condition we would expect without private(p_t, p_qv, p_qc) has little effect on performance. In both cases, with or without the clause, the compiler output does not indicate that it has recognized the need for private pointer variables in the kernel. So does the compiler just silently create private pointers in the kernel without telling us?
My third question, or rather observation, is that the 700 error occurs when we use declare create on a derived type variable but do not explicitly copy its allocatable members to the device.
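To make the third point concrete, here is a minimal sketch of the ordering that, as far as I understand, avoids the problem: the parent variable is placed on the device by declare create, and each allocatable member is created on the device explicitly after the host allocate. The names demo_fields, box, and b are hypothetical and chosen only for this illustration; this is not the program from this post, and I have reduced it to one member.

```fortran
! Minimal sketch; demo_fields, box and b are hypothetical names.
module demo_fields
  implicit none
  type box
    real(8), allocatable :: t(:,:,:)
  end type
  type(box) :: b
  ! Assumption: this creates only the parent b on the device,
  ! not the storage of its allocatable member.
  !$acc declare create(b)
end module demo_fields

program demo
  use demo_fields
  implicit none
  integer, parameter :: n = 8
  integer :: i, j, k

  allocate(b%t(n,n,n))
  ! Create the member on the device after the host allocation,
  ! so that the device copy of b refers to valid device memory.
  !$acc enter data create(b%t)

  !$acc parallel loop collapse(3) present(b)
  do k = 1, n
    do j = 1, n
      do i = 1, n
        b%t(i,j,k) = i + j + k
      end do
    end do
  end do

  ! Copy the member back to the host and release its device storage.
  !$acc exit data copyout(b%t)
  print *, b%t(n,n,n)
  deallocate(b%t)
end program demo
```

If the enter data create line is left out, I would expect the kernel to dereference a member that was never set up on the device, which matches the failure I describe above.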
Here is my code:
MODULE m_fields
  type pair
    REAL*8, ALLOCATABLE :: qv(:,:,:)
    REAL*8, ALLOCATABLE :: t(:,:,:)
  end type
  type (pair), target :: pair_t_qv
  !!$acc declare create(pair_t_qv)
END MODULE m_fields
program test_pointer
use m_fields
real*8, allocatable,target :: qc(:,:,:)
!!$acc declare create(qc)
real*8, pointer :: p_t => null()
real*8, pointer :: p_qc => null(), p_qv => null()
type (pair), pointer :: p_pair => null()
integer :: i, j, k
integer :: n=100
p_pair => pair_t_qv
ALLOCATE ( p_pair%t(n,n,n) )
ALLOCATE ( p_pair%qv(n,n,n) )
ALLOCATE ( qc(n,n,n) )
!!$acc data
!Method 1: unstructured data region
!!$acc enter data copyin(p_pair)
!!$acc enter data copyin(p_pair%t, p_pair%qv)
!Method 2: structured data region; either of the next two lines will work
!!$acc data
!!$acc data copyin(p_pair,qc)
!Method 3: remove all the data directives; the compiler will implicitly copy what it needs
!$acc kernels
!either of the next two lines will work
!!$acc loop independent collapse(3) private(p_t, p_qv, p_qc)
!$acc loop independent collapse(3)
DO j = 1,n
  DO i = 1,n
    DO k = 1, n
      !p_t => pair_t_qv%t(i,j,k)
      p_t => p_pair%t(i,j,k)
      p_t = i+j+k
      !p_qv => pair_t_qv%qv(i,j,k)
      p_qv => p_pair%qv(i,j,k)
      p_qv = i*j*k
      p_qc => qc(i,j,k)
      p_qc = p_t+p_qv
    END DO
  END DO
END DO
!$acc end kernels
!method 2
!!$acc end data
!method 1
!!$acc exit data copyout(p_pair%t, p_pair%qv)
!!$acc exit data delete(p_pair)
print*, pair_t_qv%t(n,n,n)
print*, pair_t_qv%qv(n,n,n)
print*, qc(n,n,n)
deallocate(qc)
deallocate(pair_t_qv%t)
deallocate(pair_t_qv%qv)
end program
I compiled with:
nvfortran -g -pg -Mlarge_arrays -m64 -Wall -Werror -gpu=ccall,managed,implicitsections -stdpar -traceback -ffpe-trap=invalid,zero,overflow -Minfo=accel -cpp -acc -o test_pointer_4 test_pointer_4ok.f90
Thanks.