# How to parallelize the outer loop only

Hi,
I try to add accelerator directives into a subroutine with nested loops. The limits of inner loops are variant in the outer loops. May I parallelize the outer loop only to overcome the restriction that inner loop limits must be constant? Attached please find the code. Any suggestion is appreciated.

``````
      SUBROUTINE GRID_BASED_NEIGHBOR_SEARCH

      USE param1
      USE discretelement
      USE geometry
      USE des_bc

      IMPLICIT NONE
!-----------------------------------------------
! Local variables
!-----------------------------------------------
      INTEGER I, II, IP1, IM1   ! X-coordinate loop indices
      INTEGER J, JJ, JP1, JM1   ! Y-coordinate loop indices
      INTEGER K, KK, KP1, KM1   ! Z-coordinate loop indices
      INTEGER PNO               ! Temp. particle number variable
      INTEGER NPG               ! Temp. cell particle count
      INTEGER LL, NP, NEIGH_L   ! Loop counters
      INTEGER NLIM

      DOUBLE PRECISION DISTVEC(DIMN), DIST, R_LM ! Contact variables

!$acc region do kernel copy(neighbours) copyin(pijk,des_pos_new) &
!$acc       copyin(imin1,imax1,jmin1,jmax1,dimn,kmin1,kmax1)
      DO LL = 1, MAX_PIS

! Index range of the cells surrounding the cell containing particle LL
        II = PIJK(LL,1); IP1=min(II+1,imax1); IM1=max(II-1,imin1)
        JJ = PIJK(LL,2); JP1=min(JJ+1,jmax1); JM1=max(JJ-1,jmin1)
        KK = PIJK(LL,3); KP1=KK;   KM1=KK
        IF(DIMN.EQ.3)THEN
          KP1 = min(KK+1,kmax1);   KM1 = max(KK-1,kmin1)
        ENDIF

        DO KK = KM1, KP1
        DO JJ = JM1, JP1
        DO II = IM1, IP1
! Shift loop index to new variables for manipulation
          I = II;   J = JJ;   K = KK
! If cell IJK contains particles, store the amount in NPG
          IF(ASSOCIATED(PIC(I,J,K)%P))THEN
            NPG = SIZE(PIC(I,J,K)%P)
          ELSE
            NPG = 0
          ENDIF

! Loop over the particles in cell IJK to determine whether they are
! neighbours of particle LL
          DO NP = 1, NPG
            PNO = PIC(I,J,K)%P(NP)

            IF(PNO.GT.LL)THEN
! Contact search radius. The posted code scaled an undefined R_LM;
! the sum of the two particle radii is assumed here.
              R_LM = DES_RADIUS(LL) + DES_RADIUS(PNO)
              R_LM = FACTOR_RLM*R_LM
              DISTVEC(:) = DES_POS_NEW(PNO,:) - DES_POS_NEW(LL,:)
              IF(DIMN.EQ.2)THEN
                DIST = sqrt(DISTVEC(1)**2+DISTVEC(2)**2)
              ELSE
                DIST = sqrt(DISTVEC(1)**2+DISTVEC(2)**2+DISTVEC(3)**2)
              ENDIF

              IF(DIST .LE. R_LM) THEN
                NEIGHBOURS(LL,1) = NEIGHBOURS(LL,1) + 1
                NLIM = NEIGHBOURS(LL,1) + 1
                NEIGHBOURS(LL,NLIM) = PNO
              ENDIF  ! contact condition
            ENDIF  ! PNO.GT.LL
          ENDDO  ! NP

        ENDDO  ! II cell loop
        ENDDO  ! JJ cell loop
        ENDDO  ! KK cell loop

      ENDDO  ! Particles in system loop
!$acc end region

      END SUBROUTINE GRID_BASED_NEIGHBOR_SEARCH
``````


Hi Tingwen,

You should be able to work around the rectangular loop restriction using the “kernel” clause (as you have it now). However, you’ll need to remove the “ASSOCIATED” call since it isn’t supported on the GPU. Also, you’ll need to privatize DISTVEC (i.e., add “private(DISTVEC)” to your kernel clause).
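
For example, the directive at the top of the LL loop might look something like this (a sketch only; keep whatever copy/copyin list actually matches your modules):

``````
!$acc region do kernel private(DISTVEC)               &
!$acc       copy(neighbours) copyin(pijk,des_pos_new) &
!$acc       copyin(imin1,imax1,jmin1,jmax1,dimn,kmin1,kmax1)
      DO LL = 1, MAX_PIS
``````

For the “ASSOCIATED” test, one option is to precompute the per-cell particle counts on the host into a plain integer array (say, a hypothetical PIC_COUNT(I,J,K)) and read that inside the region instead of calling ASSOCIATED/SIZE.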

If I’ve missed anything, post the output from your compile with “-Minfo=accel” and I’ll take a look.

Hope this helps,
Mat

Hi Mat,
Thanks for your prompt reply. I made the changes and worked around the “ASSOCIATED” call by commenting it out and setting NPG and PNO to constants. Below is the output when I compile with PGI 10.3. Do you have any idea what is wrong? Thanks.

``````
     22, No parallel kernels found, accelerator region ignored
     25, Loop carried dependence due to exposed use of 'distvec(1:3)' prevents parallelization
     55, Loop carried dependence due to exposed use of 'distvec(1:3)' prevents parallelization
         Loop carried dependence due to exposed use of 'neighbours(i1+1,1)' prevents parallelization
     56, Loop carried dependence due to exposed use of 'distvec(1:3)' prevents parallelization
         Loop carried dependence due to exposed use of 'neighbours(i1+1,1)' prevents parallelization
     57, Loop carried dependence due to exposed use of 'distvec(1:3)' prevents parallelization
         Loop carried dependence due to exposed use of 'neighbours(i1+1,1)' prevents parallelization
     70, Loop carried dependence due to exposed use of 'distvec(1:3)' prevents parallelization
         Complex loop carried dependence of 'neighbours' prevents parallelization
         Loop carried dependence due to exposed use of 'neighbours(i1+1,1)' prevents parallelization
     76, Loop is parallelizable
``````

Hi Tingwen,

Did you add the private clause for DISTVEC?

Try replacing your “!$acc region” lines with a simpler version:

``````
!$acc region
!$acc do kernel private(DISTVEC)
``````
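
Placed around your outer particle loop, that looks roughly like this (a sketch only; the loop body stays as it is):

``````
!$acc region
!$acc do kernel private(DISTVEC)
      DO LL = 1, MAX_PIS
! ... cell search and neighbour checks for particle LL ...
      ENDDO
!$acc end region
``````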

This works for me, but I did have to modify your code to work around your modules, so it’s possible my changes affected the behavior. If that’s the case, please send the full source to PGI Customer Support (trs@pgroup.com) and ask them to forward it to me.

Mat

Hi Mat,
Many thanks. I figured out a way to do it by replacing the DISTVEC vector with three scalars, and now it compiles successfully. Really appreciate your help.

Tingwen
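
For anyone hitting the same issue, the scalar replacement described above might look roughly like this (a sketch; DX, DY, and DZ are assumed names for local DOUBLE PRECISION scalars declared alongside the other locals):

``````
! In place of DISTVEC(:) inside the NP loop (DX, DY, DZ assumed names)
            DX = DES_POS_NEW(PNO,1) - DES_POS_NEW(LL,1)
            DY = DES_POS_NEW(PNO,2) - DES_POS_NEW(LL,2)
            DZ = 0.0D0
            IF(DIMN.EQ.3) DZ = DES_POS_NEW(PNO,3) - DES_POS_NEW(LL,3)
            DIST = sqrt(DX*DX + DY*DY + DZ*DZ)
``````

Scalars such as DX, DY, and DZ are privatized automatically by the compiler, which is presumably why the loop-carried dependence reported for DISTVEC goes away.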