OpenACC: Complete specification of data clauses gives the wrong answer

I am using pgfortran from the PGI 19.10 suite.
I have a very simple nested loop that just counts the total number of iterations.

program main

  implicit none
  integer i,j,N,ans

  N = 1000
  ans = 0

!$acc data copy(ans) copyin(N) create(i,j)

!$acc parallel loop private(i,j) reduction(+:ans)
do i=1,N
  do j=1,N
    ans=ans+1
  enddo
enddo
!$acc end parallel loop

!$acc end data

 write(*,*) 'ans= ',ans

end program main

Now, each time I run the program, I get a different answer!
The same problem occurs if I replace create(i,j) with copy(i,j).

BUT, if I

  • remove create(i,j) from the data construct, OR
  • remove private(i,j) from the compute construct, OR
  • remove both of the above,

I get the correct answer.

I am not sure why the complete specification (which I believe is correct and as intended) gives the wrong answer.

Also, the compiler messages (-Minfo=all) are the SAME in both cases.

Please help.
Thanks,
arun

Hi arun,

The problem is that by putting “i” and “j” in a “create” or “copy” clause, you’ve overridden the default and made these scalars shared, so every thread reads and writes the same device copy of the loop indices. Hence you’ll get a race condition.

Note that loop index variables are unique in that they are implicitly private, hence the “private(i,j)” is ignored.
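For reference, here is a minimal sketch of the variant with both clauses removed, which the original post reports as giving the correct answer. It assumes the same program as above; the only changes are dropping create(i,j) and private(i,j) and tidying the declarations.

```fortran
program main
  implicit none
  integer :: i, j, n, ans

  n = 1000
  ans = 0

  ! Only ans and n appear in data clauses. The loop indices i and j are
  ! left out, so they keep their default (implicitly private) treatment.
  !$acc data copy(ans) copyin(n)

  !$acc parallel loop reduction(+:ans)
  do i = 1, n
     do j = 1, n
        ans = ans + 1
     end do
  end do

  !$acc end data

  write(*,*) 'ans= ', ans
end program main
```

Built with "pgfortran -acc -Minfo=accel", this should report n*n = 1000000 iterations on every run. Without -acc the directives are treated as comments and the serial loop gives the same count.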

Also, the compiler messages (-Minfo=all) are the SAME in both cases.

Are you sure? They seem different when I compile the code:

% diff  arun.F90 arunNoCreate.F90
8c8
< !$acc data copy(ans) copyin(N) create(i,j)
---
> !$acc data copy(ans) copyin(N)
% pgfortran -Minfo=accel -acc arun.F90 -o arun.out
main:
      8, Generating create(j) [if not already present]
         Generating copyin(n) [if not already present]
         Generating copy(ans) [if not already present]
         Generating create(i) [if not already present]
     10, Generating Tesla code
         10, Generating reduction(+:ans)
         11, !$acc loop gang ! blockidx%x
         12, !$acc loop vector(128) ! threadidx%x
     12, Loop is parallelizable
% pgfortran -Minfo=accel -acc arunNoCreate.F90 -o arun_nc.out
main:
      8, Generating copy(ans) [if not already present]
         Generating copyin(n) [if not already present]
     10, Generating Tesla code
         10, Generating reduction(+:ans)
         11, !$acc loop gang ! blockidx%x
         12, !$acc loop vector(128) ! threadidx%x
     12, Loop is parallelizable

-Mat

I was wondering whether the copy + private combination would behave like firstprivate.
It looks like it does not.
OK, fine.

By the SAME compiler messages, I meant the parallelization part (not the copy/create part), starting from: 10, Generating Tesla code

arun

Sorry, I'm not sure why you got that impression. A variable should appear in either a copy clause or a private clause, not both.
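If the goal is a private copy that starts out with the host value, OpenACC has an explicit firstprivate clause for the parallel construct, so there is no need to combine copy and private. The sketch below is only an illustration: the scalar "step" and the program name are hypothetical and not part of the original code.

```fortran
program firstprivate_sketch
  implicit none
  integer :: i, n, step, total

  n = 1000
  step = 2      ! hypothetical scalar, set on the host before the region
  total = 0

  ! step is named only in firstprivate: each gang gets its own copy,
  ! initialized from the host value. It is not listed in any data clause.
  !$acc parallel loop firstprivate(step) reduction(+:total)
  do i = 1, n
     total = total + step
  end do

  write(*,*) 'total= ', total
end program firstprivate_sketch
```

Each variable appears in exactly one clause here: step in firstprivate and total in the reduction, with the loop index left to the default.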