reflected clause - subroutine

Hi,
I’m trying to apply the “!$acc reflected” clause in order to avoid needless data transfers. Unfortunately it doesn’t work: when I run my program I get the following error message:

call to cuMemcpyDtoH returned error 700: Launch failed
CUDA driver version: 4000

The main program looks like:

      program test_program
      use accel_lib
      implicit none
      
      common A
      common counter   
      integer, dimension(1000000)   :: A
      integer   :: counter

      integer   :: i
      integer, parameter   :: end_iter = 10

      do i=1,10
         A(i)=i
      end do

      !$acc data region copy(A)

      do i=1,end_iter
         call sub1
      end do

      !$acc end data region

      print*, A
      print*, counter

      end program test_program

And the subroutine “sub1” looks like:

      subroutine sub1
      
      use accel_lib
      implicit none

      common A
      common counter
      integer, dimension(1000000)   :: A
      integer   :: counter          
      
      integer   :: i

      !$acc reflected(A)

      !$acc region      
         do i=1,1000000
            A(i)=A(i)+1
         end do
      !$acc end region   
      counter = counter+1
      
      end subroutine

What am I doing wrong? I compile like this:

pgfortran -o test main.f90 sub1.f90

My PGI version is 11.6.
It seems that the array A is not being copied back to the host. If I run the program on the CPU it works fine.
Thank you very much for your help!

Hi elephant,

The reflected directive is meant for routine arguments, and the routine must have an explicit interface. Here you’re trying to use it on a common block variable in a routine that has no explicit interface. Probably the easiest fix is to put sub1 into a module, move A from the common block to a module allocatable array, and use the mirror directive instead of reflected (see below for an example). If you do want to use reflected, then instead of using a common block, declare A as a local array in your main program and pass it to sub1 as an argument. You’ll also need to add an explicit interface for sub1.

- Mat

      module foo
      integer, allocatable, dimension(:)   :: A
!$acc mirror(A)
      integer   :: counter   
      integer, parameter :: sze=1000000 

      contains
      subroutine sub1
     
      use accel_lib
      implicit none
     
      integer   :: i

      !$acc region     
         do i=1,sze
            A(i)=A(i)+1
         end do
      !$acc end region   
      counter = counter+1
     
      end subroutine 
      end module foo

      program test_program
      use accel_lib
      use foo
      implicit none
     
      integer   :: i
      integer, parameter   :: end_iter = 10

      allocate(A(sze))
      A=0
      ! counter is a module variable; set it before sub1 increments it
      counter=0
      do i=1,10
         A(i)=i
      end do
      ! Copy the initial host values of A to its mirrored device copy
      !$acc update device(A)
      !acc data region 

      do i=1,end_iter
         call sub1
      end do
      ! Copy the result back from the device before printing
      !$acc update host(A)

      !acc end data region

      print*, A(10),A(100)
      print*, counter

      end program test_program
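
For comparison, the other option described above (keeping reflected and passing A as an argument) might look roughly like the sketch below. It is only an illustration under a few assumptions: the module name sub_mod is made up, sub1 is placed in a module as one way to give the caller an explicit interface (a hand-written interface block would be the alternative), and counter is passed as a second argument so that no common block is needed. It has not been checked against PGI 11.6.

```fortran
! Sketch only: the "reflected" variant, assuming PGI Accelerator directives.
module sub_mod                       ! hypothetical module name
  implicit none
contains
  subroutine sub1(A, counter)
    integer, dimension(:), intent(inout) :: A
    integer, intent(inout) :: counter
    integer :: i

    ! Declares that the caller already holds a device copy of A,
    ! so the region below should not transfer it again.
    !$acc reflected(A)

    !$acc region
    do i = 1, size(A)
      A(i) = A(i) + 1
    end do
    !$acc end region

    counter = counter + 1            ! host-side bookkeeping only
  end subroutine sub1
end module sub_mod

program test_program
  use sub_mod                        ! supplies the explicit interface to sub1
  implicit none
  integer, parameter :: sze = 1000000, end_iter = 10
  integer, dimension(sze) :: A       ! local array instead of a common block
  integer :: counter, i

  A = 0
  do i = 1, 10
    A(i) = i
  end do
  counter = 0

  ! A is copied to the device once here and copied back at the end
  !$acc data region copy(A)
  do i = 1, end_iter
    call sub1(A, counter)
  end do
  !$acc end data region

  print *, A(10), A(100)
  print *, counter
end program test_program
```

Ignoring the directives, the serial logic adds 1 to every element ten times, so A(10) should come out as 20, A(100) as 10, and counter as 10. That gives a quick check that the device copy of A really was copied back at the end of the data region.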