OpenACC + MPI / Assignment of the variable failed

Hi everyone, I’m compiling and running a Fortran code with mpif90 (nvhpc 23.1). The code is complex, so for the sake of clarity I’m reporting only the parts relevant to my issue. The code uses MPI and I’m accelerating it with OpenACC.

The relevant part of the code is the following:

SUBROUTINE kin
USE global_mod, ONLY: NsMAX, num_zones, zones, MINi, MAXi, MINj, MAXj, MINk, MAXk
USE common_alloc

INTEGER, VALUE :: B, i, j, k, l, ll, s, icomp

DOUBLE PRECISION, VALUE :: OM(NUMSP), Yi_ijk(NsMAX), Xir(NsMAX)
DOUBLE PRECISION, VALUE :: T_ijk, p_ijk, dens, Rgst,S_y, Wmix_ijk, D_hp,rho_ijk
DOUBLE PRECISION, VALUE :: VREAZ(1000)
INTEGER, VALUE :: NsMAT
INTEGER, VALUE :: iR, CC, NUMDICOMP, ISPECIE
INTEGER, VALUE :: NUMDICOMPF, NUMDICOMPB, REV, TERZOCORPO
DOUBLE PRECISION, VALUE :: Kchem, Kchem_B, PRODF, PRODB, KEQ
DOUBLE PRECISION, VALUE :: NIF, NIB, NII!, omF(NUMSP),omB(NUMSP)

DOUBLE PRECISION, VALUE :: SOMMAH,SOMMAS,TEM
DOUBLE PRECISION, VALUE :: app

DOUBLE PRECISION, VALUE :: Y(NsMAX)


!--------------------------------------------------------------------------------------------------------

!$acc parallel loop collapse(3) private(Yi_ijk,iR,Xir) copy(p,T,Yi,rho,om,omega,Wmix_ijk,Wmix,icomp,dens,Kchem,TERZOCORPO,KEQ,MAT_PLOG,REACT_PLOG,TAB2,TAB2M,TROE,SRI,TAB1F,TAB1B,TAB3F,TAB3B,NUMDISPREAZF,NUMDISPREAZB,DHF,DSF,NUMSP,p_AL,p_AU,MASS,ALFAM,NUMDISPM,TABM,TRIF,NII,PRODB,PRODF)
 do k= MINk(BBB)-(Ghost-1), MAXk(BBB)+(Ghost-1)
  do j= MINj(BBB)-(Ghost-1), MAXj(BBB)+(Ghost-1)
   do i= MINi(BBB)-(Ghost-1), MAXi(BBB)+(Ghost-1)

    !!!   ...Some calculations...

!!*********** OM CALCULATION********************************
    !$acc loop independent
    DO iR=1, NUMREAZ

     NUMDICOMP=NUMDISPREAZF(iR)
!-----------------first contribution--------------------------
     !$acc loop independent
     DO CC=1,NUMDICOMP
      ISPECIE=TAB1F(iR,CC)
      NII=TAB3F(iR,CC)
      OM(ISPECIE)=OM(ISPECIE)-VREAZ(iR)*(NII)*(MASS(ISPECIE))
     END DO

     NUMDICOMP=NUMDISPREAZB(iR)
!-----------------second contribution------------------------
     !$acc loop independent
     DO CC=1,NUMDICOMP
      ISPECIE=TAB1B(iR,CC)
      NII=TAB3B(iR,CC)
      OM(ISPECIE)=OM(ISPECIE)+VREAZ(iR)*(NII)*(MASS(ISPECIE))
     END DO

    END DO  !loop on NUMREAZ
    !********************************
    
        if(i==50.and.j==50.and.k==1) then ! i,j,k value random
                print*, "OM = ",OM(16)
        end if

      omega(1:NUMSP,i,j,k) = OM(1:NUMSP)*1000.D0

       if(i==50.and.j==50.and.k==1) then
                print*, "OMEGA..",omega(16,i,j,k)
        end if

   end do 
  end do 
 end do 


END SUBROUTINE kin

The issue is the following: when I calculate OM, I obtain a number and print it for i=50, j=50 and k=1. No problem so far. Then I assign OM to omega, and when I print omega for the same i, j, k values, the value is completely different. It is as if I cannot access the omega variable. This variable is declared in common_alloc as:

double precision, dimension(: , : , : , : ), pointer :: omega

It is also allocated and deallocated correctly. What could the problem be?

For the sake of completeness, I would like to point out that the code compiles correctly and runs without errors, but the value of omega remains zero forever.

I do see a couple of errors in your code.

First, “NII” is in a copy clause, which promotes it to a shared variable and causes a race condition. Scalars are firstprivate by default, so you should remove NII from the “copy” clause. I’d recommend not including scalars in the copy clause at all unless you specifically want them to be shared, since that helps prevent this type of error.

However, arrays are shared by default, and since “OM” isn’t privatized, you have a second race condition. You’ll want to include “OM” in the private clause.

These could be causing your issues, but I would expect wrong answers rather than omega always being zero. But please give it a try.
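
As a minimal sketch of the two fixes described above (not the original routine): the per-iteration array goes in the private clause, and the scalar is left out of every data clause. The program name, the sizes "nsp" and "n", and the arithmetic are hypothetical and only stand in for the real computation.

```fortran
program private_sketch
  implicit none
  integer, parameter :: nsp = 4, n = 8      ! hypothetical sizes
  double precision :: om(nsp)               ! per-iteration scratch array
  double precision :: omega(nsp, n)         ! shared result array
  double precision :: nii                   ! per-iteration scalar
  integer :: i, s

  omega = 0.d0

  ! om is listed in private, so it is no longer shared by all iterations.
  ! nii is a scalar and is deliberately left out of every data clause,
  ! so the default scalar handling applies instead of making it shared.
  !$acc parallel loop private(om) copy(omega)
  do i = 1, n
    do s = 1, nsp
      nii = dble(s)
      om(s) = nii * dble(i)
    end do
    ! Copy the private result into this iteration's slot of omega.
    omega(1:nsp, i) = om(1:nsp) * 1000.d0
  end do

  print *, "omega(2,3) = ", omega(2,3)
end program private_sketch
```

Without OpenACC enabled the directive is just a comment, so the same source can be built serially to compare results against the GPU run.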

If that doesn’t fix the issue, given pointers are special, you may also want to try adding the dimensions when copying omega. Something like “copy(…,omega(:,:,:,:), …)”, or even with explicit bounds, “omega(1:NUMSP,1:sze1,1:sze2,1:sze3)”. Otherwise the code may just be copying the pointer but not the data it points to. Granted, this shouldn’t be necessary if you’re simply allocating the pointer, but it is sometimes necessary when the pointer is assigned.
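
A small sketch of the explicit-bounds form for a pointer array, assuming a 4D pointer like omega. The sizes "nsp", "n1", "n2", "n3" and the values written are hypothetical; only the shape of the copy clause is the point.

```fortran
program pointer_section_sketch
  implicit none
  integer, parameter :: nsp = 4, n1 = 6, n2 = 5, n3 = 3   ! hypothetical sizes
  double precision, dimension(:,:,:,:), pointer :: omega
  integer :: i, j, k

  allocate(omega(nsp, n1, n2, n3))
  omega = 0.d0

  ! Every dimension is written as a lower:upper range, so the copy clause
  ! names the whole block of data behind the pointer.
  !$acc parallel loop collapse(3) copy(omega(1:nsp, 1:n1, 1:n2, 1:n3))
  do k = 1, n3
    do j = 1, n2
      do i = 1, n1
        omega(1:nsp, i, j, k) = dble(i + j + k)
      end do
    end do
  end do

  print *, "omega(1,2,2,1) = ", omega(1,2,2,1)
  deallocate(omega)
end program pointer_section_sketch
```

Note that in Fortran section syntax a bare index in one position (for example "nsp" instead of "1:nsp") selects a single slice in that dimension rather than the full range, so each dimension should be given as a range when the whole array is needed on the device.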

-Mat

Hi Mat,

I implemented all your suggestions, but unfortunately the copy issue still persists. It’s as if the omega variable does not exist on the GPU. I don’t think that is literally true; it’s just my impression.

Now the acceleration directive is the following:

!$acc parallel loop collapse(3) private(Yi_ijk,iR,Xir,OM) copy(p,T,Yi,rho,omega(NsMAX, MINi(BBB)-(Ghost-1): MAXi(BBB)+(Ghost-1),MINj(BBB)-(Ghost-1): MAXj(BBB)+(Ghost-1), MINk(BBB)-(Ghost-1): MAXk(BBB)+(Ghost-1)))

I also tried copy(…,omega(:,:,:,:)) but it doesn’t work either. For the sake of completeness, here are the flags I’m using to compile the code:

-r8 -acc=gpu,noautopar -target=gpu -gpu=ccall,zeroinit -Mpreprocess -Mfree -Mextend -Munixlogical -Mbyteswapio -traceback -Mchkstk -Mnostack_arrays -Mnofprelaxed -Mnofpapprox -Minfo=accel

Thank you again for your support!

-Matteo

Do you have a full reproducing example you could share? If you can’t post publicly, feel free to direct message me.

If you can’t share the code, can you post the compiler feedback messages (i.e. add “-Minfo=accel”)? That might give some clues.

I’d also try adding “gang vector” to the outer “parallel loop” so the inner loops are run serially in case there’s some race condition I’m not seeing offhand with the inner loops or something in the missing “Some calculations” that is shared by the inner loops.
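
A rough sketch of what that could look like, on a reduced loop nest rather than the real routine. The names "nreac", "vreaz" and the sizes are hypothetical, and the explicit "loop seq" directives are an added assumption to make the serial inner loops visible; with "gang vector" on the outer loop there is no parallelism left for them anyway.

```fortran
program gang_vector_sketch
  implicit none
  integer, parameter :: nsp = 4, nreac = 3, n = 16   ! hypothetical sizes
  double precision :: om(nsp), omega(nsp, n), vreaz(nreac)
  integer :: i, ir, s

  vreaz = 1.d0
  omega = 0.d0

  ! gang vector puts all the parallelism on the outer loop, so each
  ! thread runs the inner loops serially on its own private om.
  !$acc parallel loop gang vector private(om) copyin(vreaz) copy(omega)
  do i = 1, n
    om = 0.d0
    !$acc loop seq
    do ir = 1, nreac
      !$acc loop seq
      do s = 1, nsp
        ! Accumulation into om is safe because this loop is serial.
        om(s) = om(s) + vreaz(ir) * dble(s)
      end do
    end do
    omega(1:nsp, i) = om(1:nsp) * 1000.d0
  end do

  print *, "omega(2,1) = ", omega(2,1)
end program gang_vector_sketch
```

If the results become correct with the inner loops serialized, that would point to a race in the inner loops (for instance several iterations accumulating into the same element) rather than a data-movement problem.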