Questions about local arrays in loops

When compiling the following code (using the PGI 15.1 compiler):

#ifdef _OPENACC
#pragma acc kernels
#pragma acc loop independent
#endif
   for( int i( 0 ); i < 100; ++i )
   {
#ifdef _OPENACC
#pragma acc loop independent
#endif
      for( int j( 0 ); j < 100; ++j )
      {
         const float v[3] = { 1., 1., 1. };
      }
   }

I get the following messages from the compiler:

pgc++ -fast -Minfo=all -acc -ta:nvidia array.cc
"array.cc", line 21: warning: variable "v" was declared but never referenced
const float v[3] = { 1., 1., 1. };
^

main:
7, Generating copyout(v[:])
Generating Tesla code
14, Loop is parallelizable
19, Loop is parallelizable
Accelerator kernel generated
14, #pragma acc loop gang /* blockIdx.y */
19, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */

Why does the compiler issue a copyout statement for a loop local variable?

When I leave out the independent clause from the inner loop, the compiler gives the following messages:

main:
7, Generating copyout(v[:])
Generating Tesla code
14, Loop is parallelizable
19, Loop carried reuse of v prevents parallelization
Inner sequential loop scheduled on accelerator
Accelerator kernel generated
14, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */

Why does this happen? Doesn’t the compiler generate a copy of v for each thread?
If a scalar is used instead of an array, the problem doesn’t occur. Why?

Thank you.

L

Hi L,

My best guess is that the compiler is hoisting this array out of the loops because it’s loop-invariant, though it shouldn’t do so in this context.
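
To illustrate what hoisting would mean here, below is a rough sketch in plain C++ with the OpenACC directives left out. It is only an assumption about the effect of the transformation, not what the compiler actually emits. The function names and the running sum are hypothetical, added so that v is actually used. In the hoisted form there is a single v shared by all iterations instead of one per iteration, which would be consistent with both the copyout(v[:]) and the "loop carried reuse of v" messages.

```cpp
// As written in the question: v is declared inside the inner loop body,
// so conceptually every iteration has its own private copy.
float sum_with_local_array()
{
   float total = 0.0f;
   for( int i = 0; i < 100; ++i )
   {
      for( int j = 0; j < 100; ++j )
      {
         const float v[3] = { 1.f, 1.f, 1.f };
         total += v[0] + v[1] + v[2];
      }
   }
   return total;
}

// Assumed effect of hoisting (illustration only): v is moved out of both
// loops because its value never changes, so all iterations now refer to
// the same array object.
float sum_with_hoisted_array()
{
   const float v[3] = { 1.f, 1.f, 1.f };
   float total = 0.0f;
   for( int i = 0; i < 100; ++i )
   {
      for( int j = 0; j < 100; ++j )
      {
         total += v[0] + v[1] + v[2];
      }
   }
   return total;
}
```

On the host both versions compute the same result, so the hoist is harmless for sequential code. The difference only matters once the loops are offloaded, where a shared array has to live in device memory rather than being private to each thread.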

Can you please send a reproducing example to PGI customer service (trs@pgroup.com)?

Thanks,
Mat