Hi all,
I am able to compile the Fortran code snippet below with OpenACC compiler directives:
!$acc parallel loop reduction(+:potential_dot_dot_acoustic) private(isource,iglob,ispec,i,j,k)
DO isource = 1,NSOURCES
  ispec = ispec_selected_source(isource)
  IF (ispec_is_inner(ispec) .eqv. phase_is_inner) THEN
    DO k=1,NGLLZ
      DO j=1,NGLLY
        DO i=1,NGLLX
          iglob = ibool(i,j,k,ispec)
          potential_dot_dot_acoustic(iglob) = potential_dot_dot_acoustic(iglob) &
              - sourcearrays(isource,1,i,j,k) * stf_pre_compute(isource) / kappastore(i,j,k,ispec)
        END DO
      END DO
    END DO
  END IF ! ispec_is_inner
END DO ! NSOURCES
!$acc end parallel
However, only the outer loop over the sources is parallelized; the three inner loops are run sequentially. The nvfortran compiler reports the following messages:
     97, Generating implicit copyin(kappastore(:,:,:,:)) [if not already present]
         Generating implicit firstprivate(nsources)
         Generating NVIDIA GPU code
         98, !$acc loop gang, vector(128) ! blockidx%x threadidx%x
             Generating reduction(+:potential_dot_dot_acoustic(:))
        101, !$acc loop seq
        102, !$acc loop seq
        103, !$acc loop seq
     97, Generating implicit copyin(ibool(:,:,:,:),ispec_is_inner(:)) [if not already present]
         Generating copy(potential_dot_dot_acoustic(:)) [if not already present]
         Generating implicit copyin(stf_pre_compute(:nsources),sourcearrays(:nsources,:1,:,:,:),ispec_selected_source(:nsources)) [if not already present]
     98, Generating implicit firstprivate(phase_is_inner)
    101, Loop carried dependence of potential_dot_dot_acoustic prevents parallelization
         Loop carried backward dependence of potential_dot_dot_acoustic prevents vectorization
    102, Loop carried dependence of potential_dot_dot_acoustic prevents parallelization
         Loop carried backward dependence of potential_dot_dot_acoustic prevents vectorization
    103, Complex loop carried dependence of potential_dot_dot_acoustic prevents parallelization
         Loop carried dependence of potential_dot_dot_acoustic prevents parallelization
         Loop carried backward dependence of potential_dot_dot_acoustic prevents vectorization
Does anyone have any suggestions for parallelizing this more efficiently?
Thanks,
Jyoti
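
One possible direction, offered as a sketch rather than a tested fix for the full code: the dependence messages come from the indirect index iglob = ibool(i,j,k,ispec). The compiler cannot prove that two iterations never write the same element of potential_dot_dot_acoustic, and two sources located in the same element really would. A common way to express this kind of scattered accumulation in OpenACC is to drop the array reduction, put the sources on gangs and the collapsed i,j,k loops on vector lanes, and protect the update with an atomic. The small program below assumes this approach; the program name, array sizes, and fill values are hypothetical stand-ins chosen only so it runs on its own, and it compares the result against a plain serial loop. With OpenACC disabled the directives are ordinary comments, so it also runs serially.

```fortran
program source_atomic_sketch
  implicit none
  ! Hypothetical small sizes, not taken from the original code.
  integer, parameter :: NGLLX = 5, NGLLY = 5, NGLLZ = 5
  integer, parameter :: NSPEC = 4, NSOURCES = 3
  integer, parameter :: NGLOB = NGLLX*NGLLY*NGLLZ*NSPEC
  integer :: ibool(NGLLX,NGLLY,NGLLZ,NSPEC)
  integer :: ispec_selected_source(NSOURCES)
  logical :: ispec_is_inner(NSPEC), phase_is_inner
  real :: potential_dot_dot_acoustic(NGLOB), expected(NGLOB)
  real :: sourcearrays(NSOURCES,1,NGLLX,NGLLY,NGLLZ)
  real :: stf_pre_compute(NSOURCES)
  real :: kappastore(NGLLX,NGLLY,NGLLZ,NSPEC)
  integer :: isource, ispec, iglob, i, j, k

  ! Stand-in data: one global point per local point, and two sources
  ! placed in the same element so that their updates collide.
  do ispec = 1, NSPEC
    do k = 1, NGLLZ
      do j = 1, NGLLY
        do i = 1, NGLLX
          ibool(i,j,k,ispec) = i + NGLLX*((j-1) + NGLLY*((k-1) + NGLLZ*(ispec-1)))
        end do
      end do
    end do
  end do
  ispec_selected_source = (/ 2, 2, 3 /)
  ispec_is_inner = .true.
  phase_is_inner = .true.
  sourcearrays = 1.0
  stf_pre_compute = (/ 1.0, 2.0, 4.0 /)
  kappastore = 2.0

  ! Serial reference result.
  expected = 0.0
  do isource = 1, NSOURCES
    ispec = ispec_selected_source(isource)
    if (ispec_is_inner(ispec) .eqv. phase_is_inner) then
      do k = 1, NGLLZ
        do j = 1, NGLLY
          do i = 1, NGLLX
            iglob = ibool(i,j,k,ispec)
            expected(iglob) = expected(iglob) &
                - sourcearrays(isource,1,i,j,k) * stf_pre_compute(isource) / kappastore(i,j,k,ispec)
          end do
        end do
      end do
    end if
  end do

  ! Accelerated version: gangs over sources, vector lanes over the
  ! collapsed i,j,k loops, atomic update instead of an array reduction.
  potential_dot_dot_acoustic = 0.0
  !$acc parallel loop gang private(ispec) copy(potential_dot_dot_acoustic)
  do isource = 1, NSOURCES
    ispec = ispec_selected_source(isource)
    if (ispec_is_inner(ispec) .eqv. phase_is_inner) then
      !$acc loop vector collapse(3) private(iglob)
      do k = 1, NGLLZ
        do j = 1, NGLLY
          do i = 1, NGLLX
            iglob = ibool(i,j,k,ispec)
            !$acc atomic update
            potential_dot_dot_acoustic(iglob) = potential_dot_dot_acoustic(iglob) &
                - sourcearrays(isource,1,i,j,k) * stf_pre_compute(isource) / kappastore(i,j,k,ispec)
          end do
        end do
      end do
    end if
  end do

  ! Stop with an error if the two versions disagree.
  if (any(abs(potential_dot_dot_acoustic - expected) > 1.0e-6)) then
    error stop "accelerated result differs from serial reference"
  end if
  print *, "accelerated and serial results match"
end program source_atomic_sketch
```

Building this with nvfortran -acc -Minfo=accel should show whether the inner loops are still scheduled as seq. Whether it is actually faster in the real code depends on NSOURCES and on how often sources share an element, since heavily contended atomics can serialize. In the full application the arrays would normally stay resident on the device through an enclosing data region rather than being copied at every call, as the copy clause does here.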