"Private" arrays in ACC kernel.

Hi Mat,

This topic is a continuation of this one. The problem there was how PGI handles private arrays, so I had to convert the private arrays to global arrays of an appropriate size. That dramatically increased the memory requirements on the CPU side. So I’d like to extend that topic and ask:
What is the correct way to work with “private” arrays?

In my kernel I have hundreds of private arrays; RHO, CPM, XXLV and XXLS are only a few of them. Those arrays are allocated on the GPU with the “acc data create” directive, so memory allocation and freeing take a lot of time. I’d therefore like to allocate one big chunk of memory and split it among the “private” arrays.


Code example:

REAL, DIMENSION(ITS+ITE,JTS+JTE,KTS+KTE,4), TARGET :: TMP_BUF
REAL, DIMENSION(:,:,:), POINTER :: XXLS
REAL, DIMENSION(:,:,:), POINTER :: XXLV
REAL, DIMENSION(:,:,:), POINTER :: CPM
REAL, DIMENSION(:,:,:), POINTER :: RHO
...
!$acc kernels
!$acc loop independent collapse(2) gang vector(16)
   do i=its,ite      ! i loop (east-west)
   do j=jts,jte      ! j loop (north-south)
         RHO => TMP_BUF(:,:,:,1)
         CPM => TMP_BUF(:,:,:,2)
         XXLV => TMP_BUF(:,:,:,3)
         XXLS => TMP_BUF(:,:,:,4)

This results in some errors; see /track/?id=263. The alternative, with the pointer assignments moved outside the kernel:

         RHO => TMP_BUF(:,:,:,1)
         CPM => TMP_BUF(:,:,:,2)
         XXLV => TMP_BUF(:,:,:,3)
         XXLS => TMP_BUF(:,:,:,4)

!!$acc data create(LTRUE,LAMI,CPM,RHO,XXLV,XXLS, ACN) &
!$acc data create(LTRUE,LAMI,TMP_BUF, ACN) &
!$acc      present(CPM,RHO,XXLV,XXLS, &
....

!$acc kernels
!$acc loop independent  collapse(2) gang vector(16)
   do i=its,ite      ! i loop (east-west)
   do j=jts,jte      ! j loop (north-south)

results in

FATAL ERROR: data in PRESENT clause was not found on device 1: name=xxls

In the last case, if I leave RHO, CPM, etc. in the data copy section, PGI generates a COPY statement for each variable.

P.S. I got no answer for /track/?id=261

Hi Alexey,

I tried to recreate the error you logged as #263, but was unable to. Granted, I may not have been reproducing it correctly, since I needed to put your file into the full application in order to get the dependent modules. Have you made changes to the dependent mod files? If so, can you wrap up a package with all the files? Were any extra flags added to the default list other than -acc?

As for the second issue, at this point we don’t yet support F90 pointers in data clauses. It sounds like we’re almost there, with one small issue left, but they want to get in a lot more testing before adding it.
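While pointers in data clauses are unsupported, one possible workaround is to keep the single large buffer but index its last dimension directly instead of going through pointer aliases. The following is only a minimal sketch under that assumption, not a tested fix for the code in this thread: the module and routine names, the slot constants IRHO and ICPM, and the arithmetic are all hypothetical placeholders.

```fortran
module slab_demo
  implicit none
  ! Hypothetical slot numbers selecting a "private" field inside the buffer.
  integer, parameter :: IRHO = 1, ICPM = 2, NSLOT = 2
contains
  subroutine fill_slabs(its, ite, jts, jte, kts, kte, t, out)
    integer, intent(in) :: its, ite, jts, jte, kts, kte
    real, intent(in)  :: t(its:ite, jts:jte, kts:kte)
    real, intent(out) :: out(its:ite, jts:jte, kts:kte)
    ! One buffer for all scratch fields, so only one device allocation is needed.
    real :: tmp_buf(its:ite, jts:jte, kts:kte, NSLOT)
    integer :: i, j, k

    !$acc data create(tmp_buf) copyin(t) copyout(out)
    !$acc kernels
    !$acc loop independent collapse(3)
    do k = kts, kte
      do j = jts, jte
        do i = its, ite
          ! Placeholder arithmetic standing in for the real physics.
          tmp_buf(i, j, k, IRHO) = 2.0 * t(i, j, k)
          tmp_buf(i, j, k, ICPM) = tmp_buf(i, j, k, IRHO) + 1.0
          out(i, j, k) = tmp_buf(i, j, k, ICPM)
        end do
      end do
    end do
    !$acc end kernels
    !$acc end data
  end subroutine fill_slabs
end module slab_demo
```

Only tmp_buf appears in the data clause, so no F90 pointer has to be present on the device. Without -acc the directives are treated as comments and the routine runs serially on the host.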

Are you able to use the “private” clause or was this the problem with your earlier post?

 !$acc loop independent collapse(2) gang vector(16) private(RHO,CPM,XXLV,XXLS)
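For illustration, here is a minimal self-contained sketch of how the private clause might be applied. It assumes that each (i,j) iteration needs only a KTS:KTE column of scratch space, which may not match the real kernel. The module and routine names and the arithmetic are hypothetical; only the array names RHO and CPM are borrowed from the post above.

```fortran
module private_demo
  implicit none
contains
  subroutine column_sum(its, ite, jts, jte, kts, kte, t, out)
    integer, intent(in) :: its, ite, jts, jte, kts, kte
    real, intent(in)  :: t(its:ite, jts:jte, kts:kte)
    real, intent(out) :: out(its:ite, jts:jte)
    ! Per-column scratch arrays; each (i,j) iteration gets its own copy.
    real :: rho(kts:kte), cpm(kts:kte)
    real :: s
    integer :: i, j, k

    !$acc kernels copyin(t) copyout(out)
    !$acc loop independent collapse(2) private(rho, cpm, s)
    do j = jts, jte
      do i = its, ite
        s = 0.0
        do k = kts, kte
          ! Placeholder arithmetic standing in for the real physics.
          rho(k) = 2.0 * t(i, j, k)
          cpm(k) = rho(k) + 1.0
          s = s + cpm(k)
        end do
        out(i, j) = s
      end do
    end do
    !$acc end kernels
  end subroutine column_sum
end module private_demo
```

Because rho and cpm are ordinary local arrays rather than pointers, they can be named in the private clause and do not need to appear in any data clause.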

I’ll take a look at #261 in a bit.

  • Mat

Hi Mat!

I’ve updated #263.

Regarding private variables: I have some problems with them in another example (see #261).

In this case (RHO, CPM…) I was trying to solve the problem of allocating hundreds of arrays. The allocation takes ~50% of the kernel execution time.