Hello,
I am wondering about when to use the attach clause. I have two different use cases in a Fortran code:
- A pointer to a 3D array that is allocated on the device, to be used inside a kernel.
- An array of pointers to 3D arrays that are allocated on the device, to be used inside a kernel.
See two minimal working examples below.
program p
  implicit none
  integer, parameter :: n = 50
  call bla(n)
  call bla(n)
  call bla(n)
contains
  subroutine bla(n)
    integer, intent(in) :: n
    real, allocatable, target, save :: a_t(:,:,:)
    real, pointer, save :: a_p(:,:,:)
    logical, save :: is_first = .true.
    integer :: i,j,k
    if(is_first) then
      is_first = .false.
      allocate(a_t(n,n,n))
      a_t(:,:,:) = 0.
      a_p => a_t
      !$acc enter data create(a_t)
      !!$acc enter data attach(a_p) ! **NOT NEEDED**
    end if
    !
    !$acc parallel loop collapse(3) default(present)
    do k=1,n
      do j=1,n
        do i=1,n
          a_p(i,j,k) = a_p(i,j,k) + 1.
        end do
      end do
    end do
    !$acc update self(a_p)
    print*,a_p(5,5,5)
  end subroutine bla
end program p
This program doesn’t need the !$acc enter data attach(a_p) directive to run correctly:
$ nvfortran -acc test.f90 && ./a.out
1.000000
2.000000
3.000000
Now, in this program:
program p
  implicit none
  integer, parameter :: n = 50
  call bla(n)
  call bla(n)
  call bla(n)
contains
  subroutine bla(n)
    integer, intent(in) :: n
    type :: arr
      real, allocatable :: s(:,:,:)
    end type arr
    type :: arr_ptr
      real, pointer, contiguous :: s(:,:,:)
    end type arr_ptr
    type(arr)    , allocatable, target, save :: a_t(:)
    type(arr_ptr), allocatable, save :: a_p(:)
    logical, save :: is_first = .true.
    integer :: i,j,k
    if(is_first) then
      is_first = .false.
      allocate(a_t(2))
      allocate(a_t(1)%s(n,n,n))
      a_t(1)%s(:,:,:) = 0.
      allocate(a_p(1))
      a_p(1)%s => a_t(1)%s
      !$acc enter data create(a_t,a_p)
      !$acc enter data copyin(a_t(1)%s)
      !!$acc enter data attach(a_p(1)%s) ! **NEEDED**
    end if
    !
    !$acc parallel loop collapse(3) default(present)
    do k=1,n
      do j=1,n
        do i=1,n
          a_p(1)%s(i,j,k) = a_p(1)%s(i,j,k) + 1.
        end do
      end do
    end do
    !$acc update self(a_p(1)%s)
    print*,a_p(1)%s(5,5,5)
  end subroutine bla
end program p
This one does need the !$acc enter data attach(a_p(1)%s) directive (commented out above); without it, the program fails:
$ nvfortran -acc test_attach.f90 && ./a.out
Failing in Thread:1
Accelerator Fatal Error: call to cuStreamSynchronize returned error 700 (CUDA_ERROR_ILLEGAL_ADDRESS): Illegal address during kernel execution
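For comparison, here is a sketch of the same program with the attach line enabled, which is the only change relative to the failing version. The comments on what each directive does on the device reflect my understanding of the OpenACC attach semantics, not anything confirmed by compiler output.

```fortran
program p
  implicit none
  integer, parameter :: n = 50
  call bla(n)
  call bla(n)
  call bla(n)
contains
  subroutine bla(n)
    integer, intent(in) :: n
    type :: arr
      real, allocatable :: s(:,:,:)
    end type arr
    type :: arr_ptr
      real, pointer, contiguous :: s(:,:,:)
    end type arr_ptr
    type(arr)    , allocatable, target, save :: a_t(:)
    type(arr_ptr), allocatable, save :: a_p(:)
    logical, save :: is_first = .true.
    integer :: i,j,k
    if(is_first) then
      is_first = .false.
      allocate(a_t(2))
      allocate(a_t(1)%s(n,n,n))
      a_t(1)%s(:,:,:) = 0.
      allocate(a_p(1))
      a_p(1)%s => a_t(1)%s
      ! Shallow device copies of the two arrays of derived types;
      ! the members they contain are not followed.
      !$acc enter data create(a_t,a_p)
      ! Copy the actual 3D data to the device.
      !$acc enter data copyin(a_t(1)%s)
      ! Point the device copy of a_p(1)%s at the device copy of its
      ! target; without this the kernel dereferences an unset pointer.
      !$acc enter data attach(a_p(1)%s)
    end if
    !$acc parallel loop collapse(3) default(present)
    do k=1,n
      do j=1,n
        do i=1,n
          a_p(1)%s(i,j,k) = a_p(1)%s(i,j,k) + 1.
        end do
      end do
    end do
    !$acc update self(a_p(1)%s)
    print*,a_p(1)%s(5,5,5)
  end subroutine bla
end program p
```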
For the first program, is this valid OpenACC, or is the compiler ensuring the “attachment” on its own? In the latter case, would it make sense for the compiler to print a message such as "Generating implicit attach", similar to its other informational messages (although perhaps the compiler is not generating any OpenACC code to ensure the “attachment”)?
If the code is not (OpenACC) standard-conforming, I guess it would be good to always use an attach directive in cases like this.
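For concreteness, the "always attach" variant of the first program would look roughly like the sketch below. Two things in it are assumptions on my side: that an explicit attach of a plain (non-member) pointer is harmless here, which I have not checked against the specification, and the switch from create to copyin, which I made only so that the device copy starts from the host zeros instead of uninitialized device memory.

```fortran
program p
  implicit none
  integer, parameter :: n = 50
  call bla(n)
  call bla(n)
  call bla(n)
contains
  subroutine bla(n)
    integer, intent(in) :: n
    real, allocatable, target, save :: a_t(:,:,:)
    real, pointer, save :: a_p(:,:,:)
    logical, save :: is_first = .true.
    integer :: i,j,k
    if(is_first) then
      is_first = .false.
      allocate(a_t(n,n,n))
      a_t(:,:,:) = 0.
      a_p => a_t
      ! copyin instead of create (assumption, see above): create
      ! allocates device memory but does not transfer the zeros.
      !$acc enter data copyin(a_t)
      ! Explicit attach, enabled unconditionally. Whether this is
      ! required by the standard or simply a no-op is the open question.
      !$acc enter data attach(a_p)
    end if
    !$acc parallel loop collapse(3) default(present)
    do k=1,n
      do j=1,n
        do i=1,n
          a_p(i,j,k) = a_p(i,j,k) + 1.
        end do
      end do
    end do
    !$acc update self(a_p)
    print*,a_p(5,5,5)
  end subroutine bla
end program p
```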
Thanks in advance!
Pedro