Apparent bug in Fortran device-to-host copies above 2GB

The code below generates an unspecified launch failure when compiled using the version 13.3 Fortran compiler:

array size = 2.000000000000000 GB
0: copyout Memcpy (host=0x2b25cc386020, dev=0xf00100000, size=-2147483648) FAILED: 4(unspecified launch failure)

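The array size appears to have been passed to Memcpy as a signed 32-bit integer:
the array is 2^31 bytes, one more than the largest value such an integer can
hold, so the size shows up as -2147483648. The same code works correctly with
the 12.9 compiler, and with smaller arrays using the 13.3 compiler.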
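
For what it's worth, the arithmetic behind the negative size in the log can be checked with a small host-only program. This is only a sketch of the suspected wraparound, not the compiler's actual code path; the program name and the variables nbytes and wrapped are made up for illustration, and the wrap is emulated in 64-bit arithmetic rather than by overflowing a real 32-bit integer.

```fortran
program size_overflow
  use iso_fortran_env, only: int32, int64
  implicit none
  ! Same element count as the failing test, held in 64 bits
  integer(int64), parameter :: mx = 256_int64 * 1024_int64**2
  ! Size of a real*8 array of mx elements, in bytes
  integer(int64), parameter :: nbytes = mx * 8_int64
  integer(int64) :: wrapped

  print *, "elements       :", mx
  print *, "bytes (64-bit) :", nbytes
  print *, "huge(int32)    :", huge(1_int32)

  ! Emulate two's-complement truncation to a signed 32-bit value
  ! (assumption: this is what happens to the size argument)
  wrapped = modulo(nbytes + 2_int64**31, 2_int64**32) - 2_int64**31
  print *, "as signed 32   :", wrapped
end program size_overflow
```

The last value printed is -2147483648, which matches the size reported in the Memcpy failure above.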

implicit none
integer, parameter :: mx=256*1024**2
integer i
real*8, device, dimension(mx) :: A,B
real*8, allocatable, dimension(:) :: Ahost
allocate(Ahost(mx))
write (6,*) "array size = ",mx*8d0/1024**3," GB"
B = 42d0
!$cuf kernel do(1) <<< *, * >>>
do i=1,mx
A(i)=B(i)
enddo
Ahost = A
write (6,*) Ahost(4)
end

Thanks Paul. I’ve recreated the problem here and have sent a report (TPR#19285) to engineering for further investigation. You can work around this issue by using allocatable device arrays instead of fixed-size ones.

% cat testWA.f90 
implicit none
integer, parameter :: mx=256*1024**2
integer i
real*8, device, allocatable, dimension(:) :: A,B
real*8, allocatable, dimension(:) :: Ahost
allocate(Ahost(mx))
allocate(A(mx))
allocate(B(mx))
write (6,*) "array size = ",mx*8d0/1024**3," GB"
B = 42d0
!$cuf kernel do(1) <<< *, * >>>
do i=1,mx
A(i)=B(i)
enddo
Ahost = A
write (6,*) Ahost(4)
end
% pgf90 -Mcuda testWA.f90 -V13.3 -Mlarge_arrays ; a.out
 array size =     2.000000000000000       GB
    42.00000000000000

Best Regards,
Mat

Out now.

dave