Hi all,
While doing the bookkeeping of device data in my OpenACC Fortran code, I was wondering whether there is a way to get the total storage of all allocated device data. Using NVIDIA’s extensions to the OpenACC runtime API, I was able to use acc_get_memory and acc_get_free_memory to get the total device resource usage, and acc_bytesalloc to get the “total bytes allocated by data or compute regions”. However, I am missing a way to get the contribution of OpenACC arrays that are mapped to CUDA device arrays. These are listed when I call acc_present_dump, but I have not found a neat way of getting the sum of their memory footprint.
I am interested in this sum because, from this post, I understand that acc_get_memory and acc_get_free_memory may report unexpected values if I do not disable the runtime memory manager. Is that correct?
Here is a small example illustrating what I am looking for:
! cat test.f90
program p
  use accel_lib
  real, device, allocatable, dimension(:,:,:) :: a_d
  real, allocatable, dimension(:,:,:) :: a_h
  allocate(a_d(10,10,10),a_h(10,10,10))
  call acc_map_data(a_h,a_d,sizeof(a_h))
  call acc_present_dump ! will report the mapped device array
  print*,acc_bytesalloc() ! will print zero
end program p
nvfortran -acc -cuda test.f90 && ./a.out
Present table dump for device[1]: NVIDIA Tesla GPU 0, compute capability 6.1, threadid=1
host:0x39b8820 device:0x7f41c7e00000 size:4000 presentcount:0+1 line:-1 name:(null)
0
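For comparison, the fallback I would like to avoid is tallying the mapped bytes by hand next to each acc_map_data call. Below is a minimal sketch of that idea in standard Fortran, without any device calls so it compiles anywhere. The module and procedure names (map_tally, tally_mapped) are hypothetical and not part of any NVIDIA API, and it assumes 4-byte default reals.

```fortran
! Hand-rolled tally of mapped bytes; names are hypothetical, not an NVIDIA API.
module map_tally
  use iso_fortran_env, only: int64
  implicit none
  private
  public :: mapped_bytes, tally_mapped
  integer(int64) :: mapped_bytes = 0_int64  ! running total of mapped bytes
contains
  subroutine tally_mapped(nbytes)
    integer(int64), intent(in) :: nbytes
    mapped_bytes = mapped_bytes + nbytes
  end subroutine tally_mapped
end module map_tally

program p
  use iso_fortran_env, only: int64
  use map_tally
  implicit none
  real, allocatable, dimension(:,:,:) :: a_h
  allocate(a_h(10,10,10))
  ! storage_size returns bits per element, so divide by 8 to get bytes.
  ! In the real code this call would sit next to each acc_map_data call.
  call tally_mapped(int(storage_size(a_h), int64) / 8_int64 * size(a_h, kind=int64))
  print *, mapped_bytes
end program p
```

With 4-byte default reals this should report 4000, matching the size:4000 entry in the present table dump above. The drawback is that every map and unmap site has to be kept in sync by hand, which is why a runtime query would be preferable.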
Thank you!