I’m trying to use the “!$acc reflected” directive to avoid needless data transfers. Unfortunately it doesn’t work: when I run my program I get the following error message:
call to cuMemcpyDtoH returned error 700: Launch failed
CUDA driver version: 4000
The main program looks like:
    program test_program
      use accel_lib
      implicit none
      common A
      common counter
      integer, dimension(1000000) :: A
      integer :: counter
      integer :: i
      integer, parameter :: end_iter = 10

      do i=1,10
        A(i)=i
      end do

      !$acc data region copy(A)
      do i=1,end_iter
        call sub1
      end do
      !$acc end data region

      print*, A
      print*, counter
    end program test_program
And the subroutine “sub1” looks like:
    subroutine sub1
      use accel_lib
      implicit none
      common A
      common counter
      integer, dimension(1000000) :: A
      integer :: counter
      integer :: i
      !$acc reflected(A)

      !$acc region
      do i=1,1000000
        A(i)=A(i)+1
      end do
      !$acc end region

      counter = counter+1
    end subroutine
What am I doing wrong? I compile like this:
pgfortran -o test main.f90 sub1.f90
My PGI version is 11.6.
It seems that the array A is not being copied back to the host. If I run the program on the CPU, it works fine.
Thank you very much for your help!