Simple question about atomic in OpenACC

Thanks a lot for taking a look at this!

It might be a simple question, but how can I parallelize the code below using OpenACC? I think I might want to use atomic, but I'm not sure.

    a = 0.0d0
    do iz = 1, zn
    do it = 1, tn
        do iiz = 1, zn
        do iit = 1, tn
          a(iiz,iit,:,:) = a(iiz,iit,:,:) + b(iz,it,:,:) * Z(iz,iiz) * T(it,iit)
        end do
        end do
    end do
    end do

Hi lintaejun,

The problem here is that the “iz” and “it” loops aren’t parallel: every (iz,it) iteration updates the same elements of “a”, so running them concurrently would create a race. Also, atomic only works on a single reference, not on array syntax, which expands into implicit loops. Even if it did, atomic would severely hurt your performance since every update would need an atomic operation.
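To make the point about a single reference concrete, here is a minimal sketch of what an atomic version might look like once the array syntax is written out as explicit loops. The trailing bounds n3 and n4, the indices i and j, the array sizes, and the choice of "parallel loop" with data clauses are all assumptions for illustration and are not from the original post. It is shown only to illustrate the syntax; every update pays for an atomic operation, so this form is expected to be slow.

```fortran
program atomic_sketch
  implicit none
  ! Assumed sizes; n3 and n4 stand in for the two trailing dimensions
  integer, parameter :: zn = 3, tn = 2, n3 = 2, n4 = 2
  real(8) :: a(zn,tn,n3,n4), b(zn,tn,n3,n4), Z(zn,zn), T(tn,tn)
  integer :: iz, it, iiz, iit, i, j

  a = 0.0d0
  b = 1.0d0
  Z = 1.0d0
  T = 1.0d0

  ! All six loops run in parallel, so different (iz,it) iterations
  ! can hit the same element of a; the atomic protects that update.
  !$acc parallel loop collapse(6) copy(a) copyin(b, Z, T)
  do iz = 1, zn
  do it = 1, tn
  do iiz = 1, zn
  do iit = 1, tn
  do j = 1, n4
  do i = 1, n3
    ! atomic applies to this one scalar element, not to a(iiz,iit,:,:)
    !$acc atomic update
    a(iiz,iit,i,j) = a(iiz,iit,i,j) + b(iz,it,i,j) * Z(iz,iiz) * T(it,iit)
  end do
  end do
  end do
  end do
  end do
  end do

  ! With all inputs set to 1, every element of a should equal zn*tn
  print *, 'a(1,1,1,1) =', a(1,1,1,1), ' expected', real(zn*tn, 8)
end program atomic_sketch
```

Compiled without OpenACC support the directives are treated as comments and the program runs serially, which gives the same values.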

If this is the only code in this loop, I’d suggest using explicit loops instead of array syntax, then moving the “iz” and “it” loops to be the innermost loops. Each element of “a” is then computed by exactly one iteration of the outer loops, so those can run in parallel. You can then optionally use a reduction on the inner loops, depending on whether you need more parallelism or need each kernel to do more work.

Something like:

!$acc kernels loop   ! Try adding collapse(4) if zn,tn are small
do iiz = 1,zn
do iit = 1,tn
do iiiz = 1,zn   ! set the correct loop bounds
do iiit = 1,tn   ! set the correct loop bounds
    asum = 0.0d0
    !!!! optionally try using a reduction
    !!!! Also using vector here may help the data access for b, Z, and T
    !!!$acc loop vector collapse(2) reduction(+:asum)
    do iz = 1,zn
    do it = 1,tn
        asum = asum + b(iz,it,iiiz,iiit) * Z(iz,iiz) * T(it,iit)
    end do
    end do
    a(iiz,iit,iiiz,iiit) = a(iiz,iit,iiiz,iiit) + asum
end do
end do
end do
end do
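
One way to check that the reordering preserves the result is a small standalone program that runs the original array-syntax loop serially and compares it with the restructured loop. This is only a sketch: the sizes, the trailing bounds n3 and n4, the random test data, and the explicit private and data clauses are assumptions, not part of the original code.

```fortran
program restructure_check
  implicit none
  ! Assumed sizes; n3 and n4 stand in for the two trailing dimensions
  integer, parameter :: zn = 4, tn = 3, n3 = 2, n4 = 2
  real(8) :: a_ref(zn,tn,n3,n4), a(zn,tn,n3,n4), b(zn,tn,n3,n4)
  real(8) :: Z(zn,zn), T(tn,tn), asum
  integer :: iz, it, iiz, iit, iiiz, iiit

  call random_number(b)
  call random_number(Z)
  call random_number(T)

  ! Reference: the original loop order with array syntax, run serially
  a_ref = 0.0d0
  do iz = 1, zn
  do it = 1, tn
    do iiz = 1, zn
    do iit = 1, tn
      a_ref(iiz,iit,:,:) = a_ref(iiz,iit,:,:) &
                         + b(iz,it,:,:) * Z(iz,iiz) * T(it,iit)
    end do
    end do
  end do
  end do

  ! Restructured: one private accumulator per element of a,
  ! with iz and it as the innermost (sequential) loops
  a = 0.0d0
  !$acc kernels loop collapse(4) private(asum) copy(a) copyin(b, Z, T)
  do iiz = 1, zn
  do iit = 1, tn
  do iiiz = 1, n3
  do iiit = 1, n4
    asum = 0.0d0
    do iz = 1, zn
    do it = 1, tn
      asum = asum + b(iz,it,iiiz,iiit) * Z(iz,iiz) * T(it,iit)
    end do
    end do
    a(iiz,iit,iiiz,iiit) = a(iiz,iit,iiiz,iiit) + asum
  end do
  end do
  end do
  end do

  ! The difference should be zero or at the level of rounding error
  print *, 'max abs difference:', maxval(abs(a - a_ref))
end program restructure_check
```

If the optional inner reduction is enabled, the summation order changes, so small rounding differences from the reference are to be expected.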
  • Mat