Using Shared Memory

Hello,

I want to use shared memory in an acc region, and it would be nice to have an example. As far as I can tell, there is no “shared” clause analogous to “private”.

Thank you,
Sayan

Hi Sayan,

The OpenACC “cache” directive specifies which arrays should be put into shared memory. The cache directive will be available in the 12.7 release.

Here’s an example:

/* N and M are assumed to be compile-time constants defined elsewhere. */
void
testgpu(float *restrict *restrict b, float *restrict *restrict a,
        int *restrict ix, int *restrict iy, float *restrict t, int c)
{
    int i, j;
#pragma acc kernels copyin(a[0:N][0:M]), copyout(b[1:N-2][1:M-2])
    {
#pragma acc loop
        for( j = 1; j < M-1; ++j ){
            /* The cache directive belongs at the top of the loop body. */
#pragma acc cache(a)
            for( i = 1; i < N-1; ++i ){
                b[i][j] = 0.5f*(a[ix[i-1]][iy[j-1]] + a[i][j])/c;
            }
        }
    }
}
  • Mat

Thank you Mat. Does the compiler internally copy data into shared memory in certain cases, or do acc directives at present only load data to/from registers or global memory?

Hi Sayan,

Sorry, but I’m not entirely sure I understand the question. Are you asking whether the compiler automatically finds opportunities to use shared memory, or whether shared memory is only used when the user explicitly adds the cache directive?

Right now the PGI Accelerator Model will automatically use shared memory; OpenACC, however, requires the user to explicitly add the cache directive. This is a difference between the two models.
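
To make the explicit OpenACC side concrete, here is a minimal sketch of a three-point average where the cache request is written out by hand. The function name smooth3 and its arguments are made up for illustration and are not part of the example above. A compiler without OpenACC support should simply ignore the pragmas and run the loop on the host.

```c
#include <stddef.h>

/* Hypothetical helper: out[i] = mean of in[i-1], in[i], in[i+1]
 * for the interior points 1 .. n-2. Boundary elements of out are untouched. */
void smooth3(const float *restrict in, float *restrict out, int n)
{
    if (n < 3) {
        return; /* no interior points to compute */
    }
#pragma acc kernels copyin(in[0:n]) copyout(out[1:n-2])
    {
#pragma acc loop
        for (int i = 1; i < n - 1; ++i) {
            /* Explicit request: keep the three-element window of "in"
             * in the fastest available memory for this iteration. */
#pragma acc cache(in[i-1:3])
            out[i] = (in[i - 1] + in[i] + in[i + 1]) / 3.0f;
        }
    }
}
```

Whether the requested window actually ends up in shared memory is still up to the compiler; the directive is a hint about which elements are worth keeping close.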

  • Mat

Yes, my question was whether the compiler will find opportunities to use GPU shared memory. Thank you again for distinguishing the two models.