;-) Hello.
As a workaround for the problem with pointers and OpenACC, I tried the ASSOCIATE construct.
It doesn't work at all, with or without OpenACC directives.
I'm using the latest pgi/13.10, but all versions give the same error.
Here is an example:
PROGRAM test_associate

  IMPLICIT NONE

  INTEGER ,PARAMETER :: N=10
  INTEGER :: IX = 1 , IY = 2
  REAL, ALLOCATABLE, DIMENSION(:,:,:,:) :: POOL
  !$acc declare mirror (POOL)

  ALLOCATE(POOL(N,N,N,2))

  ASSOCIATE ( X => POOL(:,:,:,IX), Y => POOL(:,:,:,IY) )
    !$acc kernels
    X = 1.0
    Y = 2.0
    !$acc end kernels
    !$acc update host (X,Y)
    PRINT*, "X=", X(N,N,N)
    PRINT*, "Y=", Y(N,N,N)
  END ASSOCIATE
  !$acc update host (POOL)
  PRINT*, "X/POOL=", POOL(N,N,N,IX)
  PRINT*, "Y/POOL=", POOL(N,N,N,IY)

END PROGRAM
With ifort/gfortran 4.7 there is no problem; as expected, X=1 and Y=2:
gfortran test_associate.f90 -o test_associate
test_associate
 X=   1.00000000
 Y=   2.00000000
 X/POOL=   1.00000000
 Y/POOL=   2.00000000
With PGI, the compilation first gives some strange warnings on the use of the X and Y aliases:
pgf95 -ta=host,nvidia,cuda5.5,kepler -Minfo=accel test_associate.f90 -o test_associate
PGF90-W-0155-The number of subscripts is less than the rank of pool (test_associate.f90: 18)
PGF90-W-0155-The number of subscripts is less than the rank of pool (test_associate.f90: 19)
  0 inform,   2 warnings,   0 severes, 0 fatal for test_associate
test_associate:
     13, Generating NVIDIA code
         Generating compute capability 3.0 binary
     14, Loop is parallelizable
         Accelerator kernel generated
         14, !$acc loop gang, vector(4) ! blockidx%y threadidx%y
             !$acc loop gang, vector(32) ! blockidx%x threadidx%x
     17, Generating update host(y(:,:,:,:))
         Generating update host(x(:,:,:,:))
     21, Generating update host(pool(:,:,:,:))
And the results are wrong, on the NVIDIA GPU but also on the host:
ACC_DEVICE=NVIDIA test_associate
 X=    0.000000
 Y=    0.000000
 X/POOL=    1.000000
 Y/POOL=    2.000000

ACC_DEVICE=HOST test_associate
 X=    1.000000
 Y=    1.000000
 X/POOL=    1.000000
 Y/POOL=    2.000000
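
For the "without OpenACC directives" case mentioned above, a reduced test could look like the sketch below. It is the same program with the directives removed and an explicit check added; the program name, the zero initialisation of POOL and the PASS/FAIL check are additions for illustration, not part of the original test. Under the Fortran 2003 rules for ASSOCIATE, X and Y alias the two sections of POOL, so a conforming compiler should report X=1, Y=2 and PASS.

```fortran
PROGRAM test_associate_nodirectives

  IMPLICIT NONE

  INTEGER, PARAMETER :: N = 10
  INTEGER :: IX = 1, IY = 2
  REAL, ALLOCATABLE, DIMENSION(:,:,:,:) :: POOL

  ALLOCATE(POOL(N,N,N,2))
  POOL = 0.0   ! added for illustration: start from a known state

  ! X and Y are associated with rank-3 sections of POOL, so assigning
  ! to them should modify POOL itself.
  ASSOCIATE ( X => POOL(:,:,:,IX), Y => POOL(:,:,:,IY) )
    X = 1.0
    Y = 2.0
    PRINT*, "X=", X(N,N,N)
    PRINT*, "Y=", Y(N,N,N)
  END ASSOCIATE

  ! Added check: the writes made through the aliases must be visible in POOL.
  IF (ALL(POOL(:,:,:,IX) == 1.0) .AND. ALL(POOL(:,:,:,IY) == 2.0)) THEN
    PRINT*, "PASS"
  ELSE
    PRINT*, "FAIL"
  END IF

  DEALLOCATE(POOL)

END PROGRAM test_associate_nodirectives
```

If this version also prints wrong values with pgf95, the problem is in the handling of ASSOCIATE itself rather than in the accelerator code generation.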