Keeping arrays on the GPU

Dear All,
I wrote a simple example (see below) in which my goal is to minimize data movement to and from the GPU. Once the data are initialized (x and y in init), all remaining computation is done on the GPU (in subroutine sumit).

Is this correct? That is, do x and y stay on the GPU throughout the computation?

How can I get information about data movement (from CPU to GPU and vice versa)?

!----------------- MY code -------------------------------------------------
module modules

contains

subroutine sumit(n,x,y)
  integer n
  real, dimension(n) :: x,y
  ! x and y must already be on the device; present() fails at run time otherwise
  !$acc kernels present(x,y)
  y=x+y
  !$acc end kernels
  return
end subroutine sumit

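subroutine init(n,x,y)
  integer n
  real, dimension(n) :: x,y
  ! called inside the caller's data region, so x and y are already on the device
  !$acc kernels
  x(:)=1.
  y(:)=x(:)
  !$acc end kernels
  return
end subroutine init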

end module modules

program main
  use modules
  implicit none
  integer, parameter :: n = 2**10
  integer, parameter :: nl = 10
  real, dimension(n) :: x, y
  integer :: l

  ! create allocates x and y on the device without copying from the host
  !$acc data create(x,y)
  call init(n,x,y)

  do l=1,nl
    call sumit(n,x,y)
  enddo
  ! copy the result back explicitly; create does not copy out at the end of the region
  !$acc update host(y)
  !$acc end data

  print *, y(1)
end program

Hi Barak,

Is this correct? That is, do x and y stay on the GPU throughout the computation?

Yes, it’s correct.
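
One way to confirm this by hand is to ask the OpenACC runtime whether the arrays have device copies. The sketch below is a minimal illustration, not part of the original program: it assumes a compiler with OpenACC enabled (for example a build with the -acc flag) and a GPU with memory separate from the host. The program name check_present is hypothetical; acc_is_present comes from the standard openacc module.

```fortran
program check_present
  use openacc            ! provides acc_is_present
  implicit none
  integer, parameter :: n = 2**10
  real, dimension(n) :: x, y

  ! Before the data region no device copies have been created.
  print *, 'before data region: ', acc_is_present(x), acc_is_present(y)

  !$acc data create(x,y)
  ! Inside the region both arrays should be reported as present.
  print *, 'inside data region: ', acc_is_present(x), acc_is_present(y)
  !$acc end data

  ! The device copies are released when the region ends.
  print *, 'after data region:  ', acc_is_present(x), acc_is_present(y)
end program check_present
```

Under those assumptions the query should report the arrays as present only inside the data region. With managed or shared memory, or when the code falls back to the host, the runtime may report them as present everywhere.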

How can I get information about data movement (from CPU to GPU and vice versa)?

As of yet, there’s no tool that maps the data movement. You’ll need to do this manually.
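
A simple manual check is to mark the host copy with a sentinel value and see when it changes. The sketch below is only an illustration under stated assumptions: the program name track_transfers and the sentinel value are hypothetical, the code is built with OpenACC enabled, and host and device memory are separate (no managed or unified memory).

```fortran
program track_transfers
  implicit none
  integer, parameter :: n = 2**10
  ! Hypothetical marker value that the kernel never produces.
  real, parameter :: sentinel = -1.0
  real, dimension(n) :: y

  ! The host copy starts out holding the sentinel.
  y = sentinel

  ! create allocates the device copy; nothing is copied in.
  !$acc data create(y)

  ! This kernel writes the device copy only.
  !$acc kernels present(y)
  y(:) = 1.0
  !$acc end kernels

  ! No transfer has been requested yet, so the host copy is read here.
  print *, 'host y(1) before update: ', y(1)

  ! Explicit device-to-host transfer.
  !$acc update host(y)
  print *, 'host y(1) after update:  ', y(1)

  !$acc end data
end program track_transfers
```

If the first print still shows the sentinel, no device-to-host copy happened before the explicit update. If it already shows the computed value, either a transfer occurred or the code is running with shared memory or on the host, in which case this check does not apply.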

  • Mat