For the code below, I get different results on the CPU and GPU. The differences appear in the array PFT. They change from run to run, and occasionally there is no difference at all.
!$acc data copy(NWALLS, MAX_PIP, WALLDTSPLIT, WALLCONTACT(1:MAX_PIP,1:NWALLS), PEA(1:MAX_PIP,1:4), &
!$acc& DES_POS_NEW(1:MAX_PIP,1:DIMN), W_POS_L(1:MAX_PIP,1:NWALLS,1:DIMN), PFT(1:MAX_PIP,1:MAXNEIGHBORS,1:DIMN) )
!$acc parallel
!$acc loop gang, private(LL, NI, IW, PFT_TMP(1:DIMN), DIST(1:DIMN) )
      DO LL = 1, MAX_PIP
         IF(.NOT.PEA(LL,1) .OR. PEA(LL,4) ) CYCLE
         DO IW = 1, NWALLS
            IF(.NOT.WALLDTSPLIT .OR. PEA(LL,2) .OR. PEA(LL,3) .OR. WALLCONTACT(LL,IW).NE.1 ) GOTO 200
            NI=IW !Line added by AJ for debugging
            DIST(:)=ZERO !Line added by AJ for debugging
            DIST(:) = w_pos_l(LL,IW,:) - DES_POS_NEW(LL,:)
! Save the tangential displacement history with the correction of Coulomb's law
            PFT_TMP(:)=DIST(:)
            IF (PARTICLE_SLIDE) THEN
            ELSE
               PFT(LL,NI,:) = PFT_TMP(:)
            ENDIF
            PARTICLE_SLIDE = .FALSE.
 200        CONTINUE
         ENDDO ! DO IW = 1, NWALLS
      ENDDO !Loop over particles LL to calculate wall contact
!$acc end parallel
!$acc end data
When I comment out the line
DIST(:) = w_pos_l(LL,IW,:) - DES_POS_NEW(LL,:)
I get an identical PFT array from both the CPU and GPU runs.
I checked that the arrays DES_POS_NEW and W_POS_L remain the same on both even when PFT differs.
Note that the code pasted above is a stripped-down version of part of the file model/des/calc_force_des.f from the MFIX code.