Hello ;-)
As a workaround for the problems with pointers and OpenACC, I tried the ASSOCIATE statement.
It doesn't work at all, with or without OpenACC directives.
I'm using the latest pgi/13.10, but all versions give the same error.
Here is an example:
PROGRAM test_associate
  IMPLICIT NONE
  INTEGER, PARAMETER :: N=10
  INTEGER :: IX = 1 , IY = 2
  REAL, ALLOCATABLE, DIMENSION(:,:,:,:) :: POOL
  !$acc declare mirror (POOL)
  ALLOCATE(POOL(N,N,N,2))
  ASSOCIATE ( X => POOL(:,:,:,IX), Y => POOL(:,:,:,IY) )
    !$acc kernels
    X = 1.0
    Y = 2.0
    !$acc end kernels
    !$acc update host (X,Y)
    PRINT*, "X=", X(N,N,N)
    PRINT*, "Y=", Y(N,N,N)
  END ASSOCIATE
  !$acc update host (POOL)
  PRINT*, "X/POOL=", POOL(N,N,N,IX)
  PRINT*, "Y/POOL=", POOL(N,N,N,IY)
END PROGRAM
With ifort and gfortran 4.7 there is no problem; as expected, X=1 and Y=2:
gfortran test_associate.f90 -o test_associate
test_associate
X= 1.00000000
Y= 2.00000000
X/POOL= 1.00000000
Y/POOL= 2.00000000
With pgi:
First, the compilation gives some strange warnings about the use of the X and Y aliases:
pgf95 -ta=host,nvidia,cuda5.5,kepler -Minfo=accel test_associate.f90 -o test_associate
PGF90-W-0155-The number of subscripts is less than the rank of pool (test_associate.f90: 18)
PGF90-W-0155-The number of subscripts is less than the rank of pool (test_associate.f90: 19)
0 inform, 2 warnings, 0 severes, 0 fatal for test_associate
test_associate:
     13, Generating NVIDIA code
         Generating compute capability 3.0 binary
     14, Loop is parallelizable
         Accelerator kernel generated
         14, !$acc loop gang, vector(4) ! blockidx%y threadidx%y
             !$acc loop gang, vector(32) ! blockidx%x threadidx%x
     17, Generating update host(y(:,:,:,:))
         Generating update host(x(:,:,:,:))
     21, Generating update host(pool(:,:,:,:))
And the results are wrong, on the NVIDIA GPU but also on the host:
ACC_DEVICE=NVIDIA test_associate
X= 0.000000
Y= 0.000000
X/POOL= 1.000000
Y/POOL= 2.000000
ACC_DEVICE=HOST test_associate
X= 1.000000
Y= 1.000000
X/POOL= 1.000000
Y/POOL= 2.000000
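For comparison, here is a minimal sketch of the same test with the ASSOCIATE aliases removed, so the slices of POOL are written directly. It is only an illustration and has not been run with pgi: the program name test_no_associate is hypothetical, and it assumes a plain copyout clause on the kernels region in place of the "declare mirror" directive above. Without OpenACC enabled the directives are ordinary comments, and both values printed should be 1.0 and 2.0.

```fortran
PROGRAM test_no_associate
  ! Hypothetical cross-check: same arrays and values as the test above,
  ! but without ASSOCIATE. The slices of POOL are assigned directly.
  IMPLICIT NONE
  INTEGER, PARAMETER :: N = 10
  INTEGER :: IX = 1, IY = 2
  REAL, ALLOCATABLE, DIMENSION(:,:,:,:) :: POOL

  ALLOCATE(POOL(N,N,N,2))

  ! Assumption: copyout replaces the "declare mirror" + "update host" pair,
  ! so POOL is copied back to the host at the end of the region.
  !$acc kernels copyout(POOL)
  POOL(:,:,:,IX) = 1.0
  POOL(:,:,:,IY) = 2.0
  !$acc end kernels

  PRINT*, "X/POOL=", POOL(N,N,N,IX)
  PRINT*, "Y/POOL=", POOL(N,N,N,IY)

  DEALLOCATE(POOL)
END PROGRAM test_no_associate
```

If this variant gives the right values on both targets, that would point at the handling of the ASSOCIATE names rather than at the kernels themselves.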
Bye
Juan