Proper OpenACC reduction clause on many loops within "parallel" region

Hi,

From the OpenACC spec, it seems one must put the reduction clause on all loops within a parallel region. It also seems to be needed on the parallel directive itself.
In the past, I have only put the reduction on the parallel region but not the loops and it seemed to work.

Is the following code a correct way to compute result=SUM(P(:)*Q(:))? (P and Q are stride-1, but smaller than nr*nt*np.)

!$acc parallel default(present) reduction(+:result)
!$acc loop collapse(3) reduction(+:result) 
        do k=2,npm1 
          do j=2,ntm1  
            do i=1,nrm1
              l=ntm2*nrm1*(k-2)+nrm1*(j-2)+i
              if (rb0.or.i.gt.1) then
                result=result+p(l)*q(l)
              end if
            enddo
          enddo
        enddo
!$acc loop collapse(3) reduction(+:result)
        do k=2,npm1
          do j=jm0,jm1
            do i=2,nrm1
              l=(npm2*ntm2*nrm1)
     &         +(jm1-jm0+1)*nrm2*(k-2)+nrm2*(j-jm0)+(i-1)
              if (tb0.or.j.gt.1) then
                result=result+p(l)*q(l)
              end if
            enddo
          enddo
        enddo
!$acc loop collapse(3) reduction(+:result)
        do k=1,npm1
          do j=2,ntm1
            do i=2,nrm1
              l=(npm2*ntm2*nrm1)
     &         +(npm2*(jm1-jm0+1)*nrm2)
     &         +ntm2*nrm2*(k-1)+nrm2*(j-2)+(i-1)
              if (iproc_p.eq.0.or.k.gt.1) then
                result=result+p(l)*q(l)
              end if
            enddo
          enddo
        enddo
!$acc end parallel
- Ron

Technically, yes, you are supposed to add the reduction clause on the outer parallel region. GNU will complain if you don’t. However, the NVHPC compiler’s analysis can usually auto-detect reductions, so the clause is optional when using NVHPC.
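
For reference, here is a minimal sketch of the fully explicit form, with the reduction named on the parallel construct and on each loop inside it. This is a toy program, not your code: the program name, the array size n, and the fill values are made up for illustration, and it is written in free-form source for brevity. It should build with either nvfortran -acc or gfortran -fopenacc, and without those flags the directives are just comments, so it also runs serially.

```fortran
program reduction_sketch
  implicit none
  integer, parameter :: n = 1000      ! hypothetical size, for illustration only
  integer :: i
  double precision :: p(2*n), q(2*n), result

  ! Simple fill values so the expected sum is easy to check by hand.
  p = 1.0d0
  q = 2.0d0
  result = 0.0d0

  ! The reduction is named on the parallel construct ...
  !$acc parallel copyin(p, q) reduction(+:result)
  ! ... and again on every loop that accumulates into result.
  !$acc loop reduction(+:result)
  do i = 1, n
    result = result + p(i)*q(i)
  end do
  !$acc loop reduction(+:result)
  do i = n + 1, 2*n
    result = result + p(i)*q(i)
  end do
  !$acc end parallel

  ! With these inputs the sum is 2*n*(1*2) = 4000.
  print *, result
end program reduction_sketch
```

Both loops add into the same scalar, so the value printed after the region is the combined sum over all 2*n elements.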

-Mat