Hi,
I’m trying to apply the “!$acc reflected” clause in order to avoid needless data transfers. Unfortunately it doesn’t work: when I run my program, I get the following error message:
call to cuMemcpyDtoH returned error 700: Launch failed
CUDA driver version: 4000
The main program looks like:
program test_program
  use accel_lib
  implicit none
  common A
  common counter
  integer, dimension(1000000) :: A
  integer :: counter
  integer :: i
  integer, parameter :: end_iter = 10
  do i=1,10
    A(i)=i
  end do
  !$acc data region copy(A)
  do i=1,end_iter
    call sub1
  end do
  !$acc end data region
  print*, A
  print*, counter
end program test_program
And the subroutine “sub1” looks like:
subroutine sub1
  use accel_lib
  implicit none
  common A
  common counter
  integer, dimension(1000000) :: A
  integer :: counter
  integer :: i
  !$acc reflected(A)
  !$acc region
  do i=1,1000000
    A(i)=A(i)+1
  end do
  !$acc end region
  counter = counter+1
end subroutine
What am I doing wrong? I compile like this:
pgfortran -o test main.f90 sub1.f90
My PGI version is 11.6.
It seems that the array A is not being copied back to the host. If I run the program on the CPU, it works fine.
Thank you very much for your help!