Does the WAIT directive work within accelerated routines? If not, is there a way to require that a loop in an accelerated routine finishes before the routine continues executing?
For illustration purposes only:
module Vars
integer, allocatable :: A(:), B(:)
!$acc declare create(A,B)
end module Vars
subroutine Calc()
!$acc routine worker
use Vars
…
!$acc loop
do index=1,1000
A(index) = Value(index)
end do
!-- WAIT-1:
!$acc wait
!$acc loop
do index=1,1000
A(index) = A(index) + Calc2(index)
end do
!-- WAIT-2:
!$acc wait
end subroutine Calc
integer function Calc2(max)
!$acc routine worker
use Vars
integer max
Calc2=0
!$acc loop
do index=1,max
Calc2 = Calc2+B(index)
end do
!-- WAIT-3:
!$acc wait
end function Calc2
…
allocate(A(0:10000))
allocate(B(0:10000))
do index=1,10000
B(index)=index
end do
!$acc update device(A,B)
!$acc parallel
call Calc
!$acc end parallel
!-- WAIT-4:
!$acc wait
!$acc update self(A)
Will WAIT-1 in Calc prevent its second loop from executing until the first one is done? What about Calc2’s WAIT-3? Is it needed or is there an implicit barrier at the end of routines?
And lastly, is WAIT-4 needed before the update, or is the wait already implied by the barrier at the end of the compute construct?