Hello,
I work in the CINECA User Support Team. One of our users reported a code that produces wrong numerical results. We have found how to change the code so that it produces correct results, but we wonder whether the original code should work as well or, if not, how to explain why it is wrong.
Below you find the original Fortran OpenACC code posted by our user, and how he compiles it. It shows that:
- when using the vector clause on the parallel loop directive, and
- when the loop body performs an operation directly on the array returned by a function call,
- all threads produce the same value (the one corresponding to i=1); see the output array a.
The code also reports the (correct) output for the array b, obtained by saving the return value of get_arr in a local array variable c, and then operating on c.
Note that:
- without the vector clause it works
- with acc kernels instead of parallel loop it works
- with the vector clause and scalar variables (in place of the array get_arr) it works
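For reference, the scalar variant from the last point can be sketched roughly as follows. This is only a minimal illustration, not the user's exact test: the names simple_scalar, get_val and testscalar are made up for this sketch, and I am assuming the same directives and compile line as in the full reproducer.

```fortran
module simple_scalar
contains
  ! Hypothetical scalar counterpart of get_arr (name chosen for this sketch)
  function get_val(a)
    !$acc routine seq
    integer :: get_val
    integer, intent(in) :: a
    get_val = a
  end function get_val
end module simple_scalar

program testscalar
  use simple_scalar
  implicit none
  integer, parameter :: n = 16
  integer, dimension(n) :: a
  integer :: i

  ! Same gang/worker/vector clauses as in the failing array case
  !$acc parallel loop gang worker vector copyout(a)
  do i = 1, n
    ! Operate directly on the scalar function result
    a(i) = get_val(i) * 2
  end do

  write (*,*) a
end program testscalar
```

The intended result is a(i) = 2*i for every i, which is also what the array version is meant to produce in a.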
Is this expected behaviour?
Many thanks in advance for any suggestion you may have,
best
Isabella
```fortran
! compile with:
! nvfortran -c -o test.o test.F90 -cuda -acc -gpu=cc70 -Minfo=accel -g -r8 -traceback -Mnoinline
! nvfortran -o test test.o -cuda -acc -gpu=cc70 -Minfo=accel -g -r8 -traceback -Mnoinline
module simple
contains
  function get_arr(a)
    !$acc routine seq
    integer, dimension(2) :: get_arr
    integer, intent(in) :: a
    get_arr(1) = a
    get_arr(2) = a
  end function get_arr
end module simple

program testprogram
  use simple
  implicit none
  integer, parameter :: n = 16
  integer, dimension(n) :: a, b
  integer, dimension(2) :: c
  integer :: i

  write (*,*) "test start"
  !$acc parallel loop gang worker vector &
  !$acc private(c, i) copyout(a, b)
  do i = 1, n
    c = get_arr(i) * 2
    a(i) = c(1)
    c = get_arr(i)
    b(i) = c(1) * 2
  end do
  write (*,*) "result"
  write (*,*) "a"
  write (*,*) a
  write (*,*) "b"
  write (*,*) b
end program testprogram
```