# OpenACC in Fortran subroutine

Hello, I am trying to use OpenACC to accelerate my finite difference method code, and I have run into a problem. The following is my code:

``````
program FDM
...
! iteration loop....
!$acc data copy(RHS_U,TMP1,K,R,S), copyin(dx,dy)
do
  ...
  !$acc kernels
  DO I = 2, Nx-1
    DO J = 2, Ny-1
      RHS_U(I,J) = K(I,J) * ( -TMP1(I-2,J) + 16.0_DP * TMP1(I-1,J) - 30.0_DP * TMP1(I,J)   + &
                   16.0_DP * TMP1(I+1,J) - TMP1(I+2,J) ) / ( 12.0_DP * DX**2 ) + &
                   K(I,J) * ( -TMP1(I,J-2) + 16.0_DP * TMP1(I,J-1) - 30.0_DP * TMP1(I,J)   + &
                   16.0_DP * TMP1(I,J+1) - TMP1(I,J+2) ) / ( 12.0_DP * DY**2 ) + &
                   - R(I,J) * TMP1(I,J) + S(I,J)
    END DO
  END DO
  !$acc end kernels
  ...
end do
! end iteration loop....
...
end program
``````

With the above code, the total computation time is about 10 s. I then moved the RHS_U calculation into a subroutine called CD4, which is as follows:

``````
SUBROUTINE CD4( Nx, Ny, K , R , S , dx, dy, RHS_U , TMP1 )

  IMPLICIT NONE
  INTEGER                       :: I , J
  INTEGER       , INTENT(INOUT) :: Nx
  INTEGER       , INTENT(INOUT) :: Ny

  REAL(KIND=DP) , INTENT(INOUT) :: K(:,:)
  REAL(KIND=DP) , INTENT(INOUT) :: R(:,:)
  REAL(KIND=DP) , INTENT(INOUT) :: S(:,:)

  REAL(KIND=DP) , INTENT(INOUT) :: RHS_U(:,:)
  REAL(KIND=DP) , INTENT(INOUT) :: TMP1(:,:)
  REAL(KIND=DP) , INTENT(INOUT) :: DX , DY

  !$acc kernels present(RHS_U,K,R,S,TMP1,DX,DY)
  DO I = 2 , Nx-1
    DO J = 2 , Ny-1
      RHS_U(I,J) = K(I,J) * ( -TMP1(I-2,J) + 16.0_DP * TMP1(I-1,J) - 30.0_DP * TMP1(I,J)   + &
                   16.0_DP * TMP1(I+1,J) - TMP1(I+2,J) ) / ( 12.0_DP * DX**2 ) + &
                   K(I,J) * ( -TMP1(I,J-2) + 16.0_DP * TMP1(I,J-1) - 30.0_DP * TMP1(I,J)   + &
                   16.0_DP * TMP1(I,J+1) - TMP1(I,J+2) ) / ( 12.0_DP * DY**2 ) + &
                   - R(I,J) * TMP1(I,J) + S(I,J)
    END DO
  END DO
  !$acc end kernels

END SUBROUTINE
``````

The main program then becomes:

``````
program FDM
...
! iteration loop....
!$acc data copy(RHS_U,TMP1,K,R,S), copyin(dx,dy)
do
  ...
  CALL CD4( Nx, Ny, K , R , S , dx, dy, RHS_U , TMP1 )
  ...
end do
! end iteration loop....
...
end program
``````

However, the total computation time then rises to about 18 s, so the subroutine version is slower. I do not know the reason. Any ideas? I am using PGI Accelerator Fortran Workstation V13.8.

Hi SCCS,

I’ve forgotten whether we already do this in v13.8, but we do use “INTENT(IN)” to determine whether a read-only array can be placed in texture memory. Declaring everything “INTENT(INOUT)” may inhibit this. What happens if you change all but “RHS_U” to be “INTENT(IN)”?
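For reference, here is a minimal sketch of what that change could look like. It is not the original program: the module name `cd4_sketch`, the driver `cd4_demo`, the grid size, the input values and the `DP` definition are all made up for illustration. It also assumes that `TMP1` is declared with a lower bound of 0 and an upper bound of `Nx+1`/`Ny+1`, so that the `I-2` and `I+2` accesses stay in bounds; the posted code does not show the actual declarations.

```fortran
module cd4_sketch
  implicit none
  integer, parameter :: DP = kind(1.0d0)   ! assumed definition of DP
contains
  subroutine CD4(Nx, Ny, K, R, S, dx, dy, RHS_U, TMP1)
    ! Everything that is only read is INTENT(IN); RHS_U is the only output.
    integer,       intent(in)    :: Nx, Ny
    real(kind=DP), intent(in)    :: K(:,:), R(:,:), S(:,:)
    real(kind=DP), intent(in)    :: dx, dy
    real(kind=DP), intent(inout) :: RHS_U(:,:)
    ! Assumption: TMP1 starts at index 0 and ends at Nx+1 / Ny+1,
    ! so that I-2 and I+2 are valid for I = 2 .. Nx-1.
    real(kind=DP), intent(in)    :: TMP1(0:,0:)
    integer :: I, J

    !$acc kernels present(RHS_U,K,R,S,TMP1,dx,dy)
    do I = 2, Nx-1
      do J = 2, Ny-1
        RHS_U(I,J) = K(I,J) * ( -TMP1(I-2,J) + 16.0_DP*TMP1(I-1,J) - 30.0_DP*TMP1(I,J) &
                     + 16.0_DP*TMP1(I+1,J) - TMP1(I+2,J) ) / ( 12.0_DP*dx**2 ) &
                   + K(I,J) * ( -TMP1(I,J-2) + 16.0_DP*TMP1(I,J-1) - 30.0_DP*TMP1(I,J) &
                     + 16.0_DP*TMP1(I,J+1) - TMP1(I,J+2) ) / ( 12.0_DP*dy**2 ) &
                   - R(I,J)*TMP1(I,J) + S(I,J)
      end do
    end do
    !$acc end kernels
  end subroutine CD4
end module cd4_sketch

program cd4_demo
  use cd4_sketch
  implicit none
  integer, parameter :: Nx = 8, Ny = 8     ! hypothetical grid size
  real(kind=DP) :: K(Nx,Ny), R(Nx,Ny), S(Nx,Ny), RHS_U(Nx,Ny)
  real(kind=DP) :: TMP1(0:Nx+1,0:Ny+1)
  real(kind=DP) :: dx, dy

  dx = 0.1_DP
  dy = 0.1_DP
  K = 1.0_DP
  R = 2.0_DP
  S = 5.0_DP
  TMP1 = 1.0_DP     ! constant field, so both stencil sums are zero
  RHS_U = 0.0_DP

  !$acc data copy(RHS_U,TMP1,K,R,S), copyin(dx,dy)
  call CD4(Nx, Ny, K, R, S, dx, dy, RHS_U, TMP1)
  !$acc end data

  print *, RHS_U(4,4)   ! interior value reduces to -R*TMP1 + S = 3 here
end program cd4_demo
```

Without OpenACC enabled the directives are plain comments, so the sketch also runs on the host, which makes it easy to compare results before and after the change. Putting the routine in a module gives the caller the explicit interface that assumed-shape arguments require.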

If that doesn’t help, can you post the compiler feedback messages (-Minfo=accel) for each case? Also, please post the profile information, which you can get by setting PGI_ACC_TIME=1 in your environment.

• Mat