Dear all,
While testing the use of shared memory on my old P2000 GPU (using NVHPC 22.7), I ran into an error. The simplified program below triggers out-of-bounds errors when compiled and run as follows:
nvfortran -acc test_shared.f90 && compute-sanitizer ./a.out
This is the program:
program test_shared
  implicit none
  integer, parameter :: nx = 32, ny = 32
  real(8), allocatable, dimension(:,:) :: a
  real(8), allocatable, dimension(:,:) :: b
  integer :: i, j
  real(8) :: aip, aim, ajp, ajm

  ! Arrays carry a one-cell halo around the nx-by-ny interior
  allocate(a(0:nx+1,0:ny+1))
  allocate(b(0:nx+1,0:ny+1))
  a(:,:) = 1.

  !$acc enter data copyin(a) create(b)
  !$acc parallel loop collapse(2) default(present) private(aip,aim,ajp,ajm)
  do j=1,ny
    do i=1,nx
      ! Request the 3x3 neighbourhood of a around (i,j) in shared memory
      !$acc cache(a(i-1:i+1,j-1:j+1))
      aip = a(i+1,j)+a(i,j)
      aim = a(i-1,j)+a(i,j)
      ajp = a(i,j+1)+a(i,j)
      ajm = a(i,j-1)+a(i,j)
      b(i,j) = aip + aim + ajp + ajm
    end do
  end do
  !$acc update self(b)
  print*,'b(10,10) = ', b(10,10)
end
The code runs fine if I remove the cache directive. If I don’t use compute-sanitizer, the code will crash on my P2000 GPU for sufficiently large array sizes, which unfortunately is the case in my actual problem :(.
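For comparison, here is a minimal sketch of the variant without the cache directive, i.e. the same loop nest with only that line removed. The program name test_no_cache, the exit data clean-up and the final sanity check are additions for illustration only and are not part of my actual code.

```fortran
program test_no_cache
  implicit none
  integer, parameter :: nx = 32, ny = 32
  real(8), allocatable, dimension(:,:) :: a, b
  integer :: i, j
  real(8) :: aip, aim, ajp, ajm

  ! Same layout as the failing program: interior plus a one-cell halo
  allocate(a(0:nx+1,0:ny+1), b(0:nx+1,0:ny+1))
  a(:,:) = 1.0d0

  !$acc enter data copyin(a) create(b)
  ! Identical loop nest, but with no cache directive inside it
  !$acc parallel loop collapse(2) default(present) private(aip,aim,ajp,ajm)
  do j = 1, ny
    do i = 1, nx
      aip = a(i+1,j) + a(i,j)
      aim = a(i-1,j) + a(i,j)
      ajp = a(i,j+1) + a(i,j)
      ajm = a(i,j-1) + a(i,j)
      b(i,j) = aip + aim + ajp + ajm
    end do
  end do
  !$acc update self(b)
  !$acc exit data delete(a, b)

  print *, 'b(10,10) = ', b(10,10)
  ! Every entry of a is 1, so each interior b(i,j) is 4*(1+1) = 8
  if (abs(b(10,10) - 8.0d0) > 1.0d-12) error stop 'unexpected b(10,10)'
end program test_no_cache
```

It can be built and checked with the same command line as above, with the file name swapped.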
Thanks in advance!