Hi,
I have the following simple Fortran code:
program myProgram
  use cudafor
  use constants
  implicit none
  integer :: st
  real, dimension(:,:,:), allocatable :: x, y, z
  allocate (x(NX, NY, NZ), stat=st); if ( st /= 0 ) stop " Unable to allocate x(:,:,:)"
  allocate (y(NX, NY, NZ), stat=st); if ( st /= 0 ) stop " Unable to allocate y(:,:,:)"
  allocate (z(NX, NY, NZ), stat=st); if ( st /= 0 ) stop " Unable to allocate z(:,:,:)"
  call grid(x, y, z)
  deallocate(x, y, z)
end program
The constants NX, NY, and NZ are defined in the following module:
module constants
  implicit none
  integer, parameter :: NX = 1024
  integer, parameter :: NY = 1024
  integer, parameter :: NZ = 1024
end module constants
The subroutine “grid” is as follows:
subroutine grid(x, y, z)
  use cudafor
  use constants
  implicit none
  real, intent(out), dimension(NX, NY, NZ) :: x, y, z
  integer :: ix, iy, iz
  do concurrent (ix=1:NX, iy=1:NY, iz=1:NZ)
    x(ix, iy, iz) = (ix-1)*dx
    y(ix, iy, iz) = (iy-1)*dx
    z(ix, iy, iz) = (iz-1)*dx
  end do
end subroutine
If I compile the code with nvfortran using the flags “-O3 -cuda -stdpar=gpu”, I get the following error at run time:
“__man_alloc04: call to cuMemAllocManaged returned error 2: Out of memory
Aborted”
If I instead compile with “-O3 -cuda -stdpar=multicore”, everything works fine.
If I reduce the dimensions, for example to NX=NY=NZ=32, the program also works with the -stdpar=gpu flag.
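For reference, here is a rough estimate of how much memory the three arrays need. This is only a sketch: it assumes default (4-byte) reals, and the program name and variable names are made up for illustration.

```fortran
program footprint
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
  ! Same extents as in the constants module above.
  integer, parameter :: NX = 1024, NY = 1024, NZ = 1024
  integer(int64) :: bytes_per_array, total_bytes
  real :: sample   ! only used to query the size of a default real

  ! storage_size returns the size in bits, so divide by 8 to get bytes.
  bytes_per_array = int(NX, int64) * NY * NZ * (storage_size(sample) / 8)
  total_bytes = 3_int64 * bytes_per_array   ! x, y and z

  print '(a, f8.2, a)', 'per array: ', real(bytes_per_array) / 1024.0**3, ' GiB'
  print '(a, f8.2, a)', 'total    : ', real(total_bytes) / 1024.0**3, ' GiB'
end program footprint
```

With 4-byte reals, each 1024x1024x1024 array takes 4 GiB, so the three together need 12 GiB. The 32x32x32 case needs only 384 KiB in total.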
I’m running Ubuntu 22.04 under the Windows Subsystem for Linux (WSL2) on Windows 10. My graphics card is an Nvidia Quadro T2000 and the system has 64 GB of RAM.
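To compare the array sizes against what the GPU actually offers, the device memory can be queried at run time. The following is a minimal sketch, assuming the cudaMemGetInfo and cudaGetErrorString interfaces from the cudafor module and compilation with nvfortran -cuda. The program and variable names are hypothetical.

```fortran
program query_device_memory
  use cudafor
  implicit none
  integer :: istat
  integer(kind=cuda_count_kind) :: free_bytes, total_bytes

  ! Ask the CUDA runtime how much device memory is free and installed.
  istat = cudaMemGetInfo(free_bytes, total_bytes)
  if (istat /= cudaSuccess) then
    print *, 'cudaMemGetInfo failed: ', trim(cudaGetErrorString(istat))
    stop 1
  end if

  print '(a, f8.2, a)', 'free : ', real(free_bytes) / 1024.0**3, ' GiB'
  print '(a, f8.2, a)', 'total: ', real(total_bytes) / 1024.0**3, ' GiB'
end program query_device_memory
```

If the reported total is well below the size of the allocations, that may be relevant to the cuMemAllocManaged failure, although this check alone does not confirm the cause.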
Can anyone help me understand whether there is something wrong in my code or in the compilation? Or is there a compatibility problem with my OS/hardware configuration?
Thank you in advance.