I’m currently learning PGI CUDA Fortran (PGI Workstation 10.9, Windows XP 32-bit) and wrote a small test program. It allocates an allocatable array on the host. The allocation works when the array is small but fails when the array gets large. Curiously, I never even call the subroutines in the module.
Also, the failure happens only when a subroutine in the module is declared with
attributes(global)
The code and the output are below.
Does anybody know what is happening here? Is there a limit on the size of an allocatable array, or am I missing something important? Any help would be appreciated.
- Sungjin, Kim.
module linear_system_cu
  use cudafor
contains
  attributes(global) subroutine jacobi_kernel(a, b, x, x_new, n)
    implicit none
    real, device :: a(n,n), b(n)
    real, device :: x_new(n), x(n)
    integer, value :: n
  end subroutine jacobi_kernel

  subroutine jacobi(a, x, b, tol)
    implicit none
    real, dimension(:,:), intent(in) :: a
    real, dimension(:), intent(inout) :: x
    real, dimension(:), intent(in) :: b
    real, intent(in) :: tol
  end subroutine jacobi
end module linear_system_cu
program alloc
  use linear_system_cu
  implicit none
  real, dimension(:,:,:), allocatable :: a
  integer :: ierr

  write(*,*) "Test 1."
  allocate(a(5, 100, 100), stat=ierr)
  if (ierr /= 0) then
    write(*,*) "Could not allocate a."
  else
    write(*,*) "Allocated a."
  end if
  if (allocated(a)) then
    deallocate(a)
  end if

  write(*,*) "Test 2."
  allocate(a(5, 10000, 10000), stat=ierr)
  if (ierr /= 0) then
    write(*,*) "Could not allocate a."
  else
    write(*,*) "Allocated a."
  end if
end program alloc
The result is:
PGI$ pgf90 -Mcuda alloc.f90 linear_system_cu.f90
alloc.f90:
linear_system_cu.f90:
PGI$ ./alloc.exe
Test 1.
Allocated a.
Test 2.
Could not allocate a.
However, if I remove attributes(global) from the kernel declaration, like this:
subroutine jacobi_kernel(a, b, x, x_new, n)
the result becomes:
PGI$ pgf90 -Mcuda alloc.f90 linear_system_cu.f90
alloc.f90:
linear_system_cu.f90:
PGI$ ./alloc.exe
Test 1.
Allocated a.
Test 2.
Allocated a.
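For scale, here is a rough check of how much memory the Test 2 array asks for. This is only a sketch: the program name alloc_size and its variable names are made up for illustration, and it assumes that a default real takes 4 bytes, which I have not verified for this compiler.

```fortran
! Rough size check for the Test 2 array a(5, 10000, 10000).
! Assumption: a default real occupies 4 bytes (not verified here).
program alloc_size
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)   ! integer kind with at least 18 decimal digits
  integer(i8), parameter :: bytes_per_real = 4_i8    ! assumed size of a default real
  integer(i8) :: n_elems, n_bytes

  ! Same extents as the allocate statement in Test 2.
  n_elems = 5_i8 * 10000_i8 * 10000_i8
  n_bytes = n_elems * bytes_per_real

  write(*,*) "Elements:", n_elems
  write(*,*) "Bytes:   ", n_bytes
  write(*,*) "GiB:     ", real(n_bytes, kind(1.0d0)) / 1024.0d0**3
end program alloc_size
```

Under that assumption the request is 5 x 10000 x 10000 x 4 = 2,000,000,000 bytes, about 1.86 GiB. A 32-bit Windows process has 2 GiB of user address space by default, so Test 2 needs nearly all of it as one contiguous block. That would make the allocation sensitive to anything else loaded into the process, but whether declaring attributes(global) brings in extra runtime code that takes up part of that space is only a guess on my side.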