Dear all,
Consider the Fortran code with OpenACC below, which has loops over a shared 3D array and a private 1D array. The code crashes (out-of-bounds) with compute-sanitizer for n > 40 when I use an !$acc kernels loop directive, but runs well when I use an !$acc parallel loop directive. I wonder whether this is expected due to some bad practice on my end, or whether it reveals a compiler bug. The -Minfo=accel output seems to hint that the generated kernels are equivalent…
To test the code:
for MODE in GOOD BAD; do nvfortran -acc -Minfo=accel -cpp -D_${MODE} test.f90 && compute-sanitizer ./a.out; done
The code:
program p
  implicit none
  integer, parameter :: n = 41 ! works on my Quadro P2000 for n <= 40
  real, allocatable, dimension(:,:,:) :: p2d
  real, allocatable, dimension(:) :: p1d
  integer :: i,j,k
  !
  allocate(p1d(n))
  allocate(p2d(n,n,n))
  !$acc enter data create(p1d,p2d)
#if defined(_GOOD)
  !$acc parallel loop collapse(3) default(present) private(p1d)
  do k=1,n
    do j=1,n
      do i=1,n
        p1d(i) = 1.*j*k
        p2d(i,j,k) = p1d(i)
      enddo
    enddo
  enddo
  !$acc end parallel
#elif defined(_BAD)
  !$acc kernels loop collapse(3) default(present) private(p1d)
  do k=1,n
    do j=1,n
      do i=1,n
        p1d(i) = 1.*j*k
        p2d(i,j,k) = p1d(i)
      enddo
    enddo
  enddo
  !$acc end kernels
#endif
  !$acc exit data copyout(p2d)
  if( int(p2d(10,9,10)) == 90 ) then
    print*, 'Success!'
  else
    print*, 'Failure!'
  endif
end
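The final test above inspects only p2d(10,9,10). A stricter check could compare every element against the expected value 1.*j*k on the host. The following is only a sketch of that comparison and is not part of the reproducer: the program name check_all and the counter nbad are hypothetical, and the array is filled on the CPU here purely to make the example self-contained (in the reproducer, p2d would come back from the device via the copyout).

```fortran
program check_all
  implicit none
  integer, parameter :: n = 41
  real, allocatable, dimension(:,:,:) :: p2d
  integer :: i,j,k
  integer :: nbad ! hypothetical counter of mismatching elements
  !
  allocate(p2d(n,n,n))
  ! Stand-in for the device loop: fill with the expected values on the host.
  do k=1,n
    do j=1,n
      do i=1,n
        p2d(i,j,k) = 1.*j*k
      enddo
    enddo
  enddo
  ! Compare every element against the expected value, not just one.
  nbad = 0
  do k=1,n
    do j=1,n
      do i=1,n
        if( p2d(i,j,k) /= 1.*j*k ) nbad = nbad + 1
      enddo
    enddo
  enddo
  if( nbad == 0 ) then
    print*, 'Success!'
  else
    print*, 'Failure! mismatches:', nbad
  endif
end program check_all
```

Dropping the second loop nest and the final test into the reproducer in place of the single-element check would show whether any part of p2d ends up with wrong values.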
Thanks in advance for your feedback! (I posted this question in an OpenACC Slack group, but it seems appropriate to ask it here too.)