Hi,
While investigating differences between GPU and CPU code, I found that acc update gives wrong results for Fortran arrays whose indices do not start at 1.
I have reproduced the behaviour in the following example:
module db_gpu
  implicit none
contains
  subroutine check_b(b,b_ref)
    real*8 :: b(:,:),b_ref(:,:)
    !$acc reflected(b)
    !$acc update host(b)
    print*, 'diff', sum(b-b_ref,1)
    print*, 'b', sum(b,1)
    print*, 'b_ref', sum(b_ref,1)
  end subroutine check_b
end module db_gpu

program main
  USE db_gpu
  implicit none
  integer*4 :: N,nlev,i,k,itime,nt
  real*8, allocatable :: a(:,:), b(:,:), b_ref(:,:)
  integer*4 :: dt1(8), dt2(8), t1, t2,istat
  real*8 :: rt
  N=1E3
  nlev=4
  nt=1000
  allocate(a(N,nlev))
  allocate(b(N,2:nlev+1),b_ref(N,2:nlev+1))
  do k=1,nlev
    do i=1,N
      a(i,k)=cos(6.3*(k-1)/10.0)+(1.0*i)/N
      b(i,k+1)=2*cos(6.3*(k-1)/10.0)+(1.0*i)/N
      b_ref(i,k+1)=2*cos(6.3*(k-1)/10.0)+(1.0*i)/N
    end do
  end do
  !$acc data region local(a,b)
  !$acc update device(a,b)
  call check_b(b,b_ref)
  !$acc region do kernel
  do i=1,N
    do k=1,nlev
      a(i,k)=a(i,k)*b(i,k+1)
    end do
  end do
  !$acc end region
  !$acc end data region
  print*, 'sum(a)=',sum(a)
end program main
The problem is related to array b, which is allocated as b(N,2:nlev+1).
The routine check_b compares array b against the CPU reference values and prints the differences, after b has been updated on the device and then copied back to the host.
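One detail that may or may not be relevant to the bug: check_b declares b as an assumed-shape dummy, b(:,:), so inside the routine its lower bounds are 1 even though the caller allocated it as b(N,2:nlev+1). The following CPU-only sketch illustrates this standard Fortran behaviour. It is not part of the reproducer, and the names bounds_demo, show_bounds and bounds_main are made up for the illustration.

```fortran
! Hypothetical stand-alone illustration, CPU only, no acc directives.
module bounds_demo
  implicit none
contains
  subroutine show_bounds(b)
    ! Assumed-shape dummy, declared the same way as in check_b.
    real*8 :: b(:,:)
    print *, 'callee lbound:', lbound(b), ' ubound:', ubound(b)
  end subroutine show_bounds
end module bounds_demo

program bounds_main
  use bounds_demo
  implicit none
  integer, parameter :: n = 3, nlev = 4
  real*8, allocatable :: b(:,:)

  ! Second dimension starts at 2, as in the reproducer.
  allocate(b(n, 2:nlev+1))
  b = 0.0d0

  print *, 'caller lbound:', lbound(b), ' ubound:', ubound(b)
  call show_bounds(b)
end program bounds_main
```

In the caller the bounds are (1,2) to (3,5), while inside show_bounds they are (1,1) to (3,4). Whether this remapping is what confuses acc update in the compiler is only a guess.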
Here are the results:
> pgf90 -ta=nvidia -o test_error_ind test_error_ind.f90
> ./test_error_ind
diff 2605.891802042723 383.9451061487198
1004.421294629574 1239.255130857229
b 2500.500023603439 2500.500023603439
2116.554917454720 1112.133622825146
b_ref -105.3917784392834 2116.554917454720
1112.133622825146 -127.1215080320835
sum(a)= 3802.033573910594
As you can see, the values of b and b_ref differ, although the only operations on b were acc update device and acc update host.
If you look carefully at the lines showing the sums of b and b_ref over the first dimension, you can see that the values in b are shifted by one index: 2116.55 and 1112.13 are the third and fourth sums for b, but the second and third sums for b_ref.
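To make the shift explicit, here is a minimal sketch that compares the printed column sums of b against those of b_ref offset by one position. The numbers are copied from the output above. The program name shift_check and the tolerance tol are arbitrary choices for this illustration.

```fortran
! Hypothetical helper: checks whether b_sum(k+1) equals ref_sum(k),
! using the column sums printed by the reproducer.
program shift_check
  implicit none
  integer, parameter :: nlev = 4
  real*8, parameter :: tol = 1.0d-9   ! arbitrary comparison tolerance
  real*8 :: b_sum(nlev), ref_sum(nlev)
  integer :: k

  ! sum(b,1) as printed after the device/host round trip
  b_sum = (/ 2500.500023603439d0, 2500.500023603439d0, &
             2116.554917454720d0, 1112.133622825146d0 /)
  ! sum(b_ref,1) as printed
  ref_sum = (/ -105.3917784392834d0, 2116.554917454720d0, &
               1112.133622825146d0, -127.1215080320835d0 /)

  do k = 1, nlev-1
    print *, 'k =', k, ' b_sum(k+1) matches ref_sum(k):', &
             abs(b_sum(k+1) - ref_sum(k)) < tol
  end do
end program shift_check
```

With these values the comparison holds for k=2 and k=3 but not for k=1, so the shift is visible in the last two columns of b.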
Unless I am doing something wrong here, I think this is quite a serious bug which should be addressed.
Regards,
Xavier
Note: PGI version 11.8