Hi everyone,
I do not understand why the jl loop in the first version below is executed sequentially with PGI 17.10, while it gets parallelized with the Cray compiler. We now use the second version, which works with both compilers, but it would still be interesting to know why the first one is not parallelized.
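
First version (jl loop runs sequentially with PGI 17.10):

!$acc parallel
!$acc loop seq
DO jk = itop,klevm1
  ! Reset the counter of negative TKE values for this level
  ztest = 0._wp
  !$acc loop gang vector reduction(+:ztest)
  DO jl = 1,kproma
    ptke(jl,jk) = bb(jl,jk,itke) + tpfac3*pztkevn(jl,jk)
    ztest = ztest+MERGE(1._wp,0._wp,ptke(jl,jk)<0._wp)
  END DO
  ! Leave the level loop as soon as a negative value was found
  IF(ztest.NE.0._wp) THEN
    EXIT
  ENDIF
END DO
!$acc end parallel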
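
Second version (jl loop is parallelized by both compilers):

! Count negative TKE values over all levels; no early exit inside the region
ztest = 0._wp
!$acc parallel
!$acc loop seq
DO jk = itop,klevm1
  !$acc loop gang vector reduction(+:ztest)
  DO jl = 1,kproma
    ptke(jl,jk) = bb(jl,jk,itke) + tpfac3*pztkevn(jl,jk)
    ztest = ztest+MERGE(1._wp,0._wp,ptke(jl,jk)<0._wp)
  END DO
END DO
!$acc end parallel
! Abort after the parallel region if any negative value was found
IF(ztest.NE.0._wp) THEN
  CALL finish('vdiff_tendencies','TKE IS NEGATIVE')
ENDIF
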
Thank you for your answer.