Hello!
Compiling the following code:
299 !!$acc loop collapse(3) independent
300 DO i = max(its,ibe-spec_zone+1), itf
301 b_dist = ibe - i
302 DO k = kts, ktf
303 DO j = max(jts,b_dist+jbs+1), min(jtf,jbe-b_dist-1)
304 mu_old_ = muts(i,j) - dt*mu_tend(i,j)
305
306 field(i,k,j) = field(i,k,j)*mu_old_/muts(i,j) + &
307 dt*field_tend(i,k,j)/muts(i,j) + &
308 ph_save(i,k,j)*(mu_old_/muts(i,j) - 1.)
309
310 ENDDO
311 ENDDO
312 ENDDO
I see the following compiler feedback:
300, Loop is parallelizable
Accelerator kernel generated
300, !$acc loop gang, vector(128) ! blockidx%x threadidx%x
302, Loop is parallelizable
303, Complex loop carried dependence of 'field' prevents parallelization
Loop carried dependence of 'field' prevents parallelization
Loop carried backward dependence of 'field' prevents vectorization
Inner sequential loop scheduled on accelerator
PGI launches with Grid=1, block=128.
I don’t see any data dependence here. Uncommenting the `!$acc loop` directive on line 299 results in:
300, Loop is parallelizable
Accelerator kernel generated
300, !$acc loop gang, vector(128) ! blockidx%x threadidx%x
302, Loop is parallelizable
303, Loop is parallelizable
So there is no message saying that the two inner loops were actually parallelized. Doesn’t PGI trust the programmer?
In this case the launch configuration is the same.
PGI 13.9 was used.
WBR,
Alexey