a question

I have a question that maybe only the developers of the compiler can answer.
In
#pragma acc kernels loop
for( i = 0; i < n; ++i ) r[i] = a[i]*2.0f;

the loop will be executed in parallel on the GPU. So at the beginning, the host should copy the array a[n] to the device. But how can the compiler know the size of a, since a is a pointer type?

It will try to use the loop bounds if possible. If it can't determine the size, it will issue a feedback message (shown when compiling with -Minfo=accel) stating that the size can't be determined, and the user will need to add a copy clause indicating the number of elements to copy.
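As a rough sketch of what adding such a clause could look like, the loop from the question might be written with explicit data clauses. This assumes OpenACC array-section syntax, `name[start:length]`; the function name `scale_by_two` is made up for illustration, and older directive dialects may use a different range notation, so check your compiler's documentation.

```c
#include <stddef.h>

/* Hypothetical helper: writes 2*a[i] into r[i] for i in [0, n).
 * The data clauses give the compiler the element count explicitly,
 * so it does not have to infer the size of the pointers from the loop bounds. */
void scale_by_two(const float *a, float *r, int n)
{
    int i;
    /* copyin: a[0..n-1] goes host -> device before the loop.
     * copyout: r[0..n-1] comes device -> host after the loop. */
    #pragma acc kernels loop copyin(a[0:n]) copyout(r[0:n])
    for (i = 0; i < n; ++i)
        r[i] = a[i] * 2.0f;
}
```

A compiler without OpenACC enabled ignores the pragma and runs the loop on the host, so the function gives the same result either way. With the PGI compilers, building with something like `-acc -Minfo=accel` should print the feedback messages showing which copies were generated.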

  • Mat

Thanks a lot, Mat.