Accelerator Region Ignored for Global Variable

Hello,

I have a function, called from within an accelerated region of the main program, that in turn calls a subroutine. If the argument passed to the subroutine is declared locally within the function, the accelerated loop in the main program gets parallelized just fine. However, if the argument is a global (module) variable, the loop does not get parallelized.

Could you please explain why this is?


For example:

program main
use accel_lib

EXTERNAL FUNC1

integer :: n=10 ! size of the vector
real :: elev=0.0, azim=0.0, SPRAD(n)
integer :: i

call acc_init( acc_device_nvidia )
!$acc region
do i = 1,n
SPRAD(i) = FUNC1(elev,azim)
enddo
!$acc end region
end program

REAL FUNCTION FUNC1(ELEV,AZIM)

USE HEADER

EXTERNAL SUB1

REAL, INTENT(IN) :: ELEV,AZIM
REAL :: EXTRA
LOGICAL :: FLEARTH


CALL SUB1(ELEV, AZIM, EXTRA, 0, FLEARTH) !Works
!CALL SUB1(ELEV, AZIM, SOLAZ, 0, FLEARTH) !Doesn't work

ISIS_API = 5.0

RETURN
END FUNCTION FUNC1

SUBROUTINE SUB1(ELEV,AZIM,LAT0,IBODY,FLEARTH)

!**** Argument Declarations

INTEGER, INTENT(IN) :: IBODY
REAL , INTENT(IN) :: ELEV,AZIM,LAT0
LOGICAL, INTENT(OUT) :: FLEARTH

RETURN
END SUBROUTINE SUB1

MODULE HEADER

REAL :: SOLAZ

END MODULE HEADER

Hi Pebbles,

It seems to work for me. I did need to move the “HEADER” module before the main program. Perhaps this was the problem?

If not, can you post a reproducing example?

- Mat

% cat test.f90
MODULE HEADER

REAL :: SOLAZ

END MODULE HEADER

program main
use accel_lib

EXTERNAL FUNC1

integer :: n=10 ! size of the vector
real :: elev=0.0, azim=0.0, SPRAD(n)
integer :: i

call acc_init( acc_device_nvidia )
!$acc region
do i = 1,n
SPRAD(i) = FUNC1(elev,azim)
enddo
!$acc end region
end program

REAL FUNCTION FUNC1(ELEV,AZIM)

USE HEADER

EXTERNAL SUB1

REAL, INTENT(IN) :: ELEV,AZIM
REAL :: EXTRA
LOGICAL :: FLEARTH


!CALL SUB1(ELEV, AZIM, EXTRA, 0, FLEARTH) !Works
CALL SUB1(ELEV, AZIM, SOLAZ, 0, FLEARTH) !Doesn't work

ISIS_API = 5.0

RETURN
END FUNCTION FUNC1

SUBROUTINE SUB1(ELEV,AZIM,LAT0,IBODY,FLEARTH)

!**** Argument Declarations

INTEGER, INTENT(IN) :: IBODY
REAL , INTENT(IN) :: ELEV,AZIM,LAT0
LOGICAL, INTENT(OUT) :: FLEARTH

RETURN
END SUBROUTINE SUB1

% pgf90 -ta=nvidia,time test.f90 -Minfo=accel -Minline -V10.8
main:
     17, Generating copyout(sprad(1:n))
         Generating compute capability 1.0 binary
         Generating compute capability 1.3 binary
     18, Loop is parallelizable
         Accelerator kernel generated
         18, !$acc do parallel, vector(256)
             CC 1.0 : 3 registers; 20 shared, 28 constant, 0 local memory bytes; 100 occupancy
             CC 1.3 : 3 registers; 20 shared, 28 constant, 0 local memory bytes; 100 occupancy

Mat,

Thanks for the fast reply. I inadvertently left the SAVE attribute off the header module in my example. The SAVE attribute appears to be what prevents the parallelization. Please replace the header module with:

MODULE HEADER
REAL, SAVE :: SOLAZ
END MODULE HEADER

Hi Pebbles,

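The compiler isn’t able to inline functions that have arguments with the SAVE attribute. Since calls inside an accelerator region need to be inlined (hence the -Minline flag), the loop can’t be offloaded when the inlining fails. Since SOLAZ is in a module, the SAVE attribute isn’t needed. Can you try removing SAVE?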

- Mat
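
For reference, here is a minimal sketch of the suggested change, assuming the only intended difference from test.f90 above is that SOLAZ is declared without SAVE. Several details are illustrative assumptions that were not in the original posts: n is made a named constant, SOLAZ is given a value in the main program (an initializer on the declaration would itself imply SAVE), the function result is assigned through FUNC1 rather than ISIS_API, FLEARTH is set in SUB1, and the accel_lib calls are left out so the file also builds with compilers that have no accelerator support. It has not been checked against pgf90 10.8.

```fortran
! Sketch only: SOLAZ as a plain module variable, without SAVE.
module header
  implicit none
  real :: solaz              ! no SAVE attribute and no initializer
                             ! (an initializer would imply SAVE)
end module header

program main
  use header
  implicit none
  integer, parameter :: n = 10     ! size of the vector
  real, external :: func1
  real :: elev = 0.0, azim = 0.0, sprad(n)
  integer :: i

  solaz = 0.0                      ! define the module variable on the host

  ! The directives below are plain comments to a compiler
  ! without accelerator support.
  !$acc region
  do i = 1, n
     sprad(i) = func1(elev, azim)
  end do
  !$acc end region

  print *, sprad
end program main

real function func1(elev, azim)
  use header
  implicit none
  real, intent(in) :: elev, azim
  logical :: flearth

  ! Module variable passed as the actual argument.
  call sub1(elev, azim, solaz, 0, flearth)

  func1 = 5.0                      ! set the function result
end function func1

subroutine sub1(elev, azim, lat0, ibody, flearth)
  implicit none
  integer, intent(in)  :: ibody
  real,    intent(in)  :: elev, azim, lat0
  logical, intent(out) :: flearth

  flearth = .false.                ! give the INTENT(OUT) dummy a value
end subroutine sub1
```

Building this with the same options as in the command above (including -Minline and -Minfo=accel) should show whether the kernel is generated once SAVE is gone.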