Here’s an interesting one.
First, the code. It's an excerpt from a larger program, which is why the line numbers in the output below look odd: line 460 is the data copyin line.
I’m using PGCC v12.8.
#pragma acc data copyin(a[0:n])
{
    #pragma acc parallel loop reduction(+:z)
    for (i = 0; i < n; i++) {
        z += a[i];
    }
}
And the compilation output:
460, Generating copyin(a[0:n])
464, Accelerator kernel generated
464, CC 1.0 : 7 registers; 48 shared, 32 constant, 0 local memory bytes
CC 2.0 : 12 registers; 0 shared, 72 constant, 0 local memory bytes
465, #pragma acc loop gang, vector(256) /* blockIdx.x threadIdx.x */
464, Generating copyin(a[0:n])
Generating compute capability 1.0 binary
Generating compute capability 2.0 binary
So, some questions:
- Could the compiler output be changed so that it tells me it has actually generated the reduction? If I omit the reduction clause, the compiler realises I wanted a reduction, inserts one for me AND tells me it has done so. With the explicit clause I get no such message, but I presume it's working, as I get the following from the runtime diagnostic:
464: region entered 10 times
time(us): total=4,798 init=3 region=4,795
kernels=1,525
w/o init: total=4,795 max=1,253 min=391 avg=479
464: kernel launched 10 times
grid: [4096] block: [256]
time(us): total=1,395 max=143 min=138 avg=139
465: kernel launched 10 times
grid: [1] block: [256]
time(us): total=130 max=13 min=13 avg=13
I presume this means that the kernel launched at 465 is really a reduction function/kernel inserted by the compiler.
- Can I disable the automatic reduction detection? Sometimes, when I want to show someone the reduction clause working, it'd be nice to be able to show it not working and how this might cause the wrong answer due to overwrites of z etc. (or z being automatically privatized).
- Looking at the runtime output, I can see only one copyin is done:
460: region entered 10 times
time(us): total=13,067 init=3 region=13,064
data=6,335
w/o init: total=13,064 max=3,933 min=1,009 avg=1,306
But this is confusing, as the compiler output reports a copyin at line 464 as well as at line 460. Is this just a compiler output bug?
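To make my presumption in the first question concrete, here is a host-side sketch in plain C of the two-pass scheme I think the timings point to: a first pass where each block of 256 threads produces one partial sum (the grid [4096] launch), then a second pass where a single block of 256 threads folds those partials into z (the grid [1] launch). This is only my guess at the structure, not the code the compiler actually generates. The names `partial_sums`, `final_sum` and `two_stage_sum`, and the float element type, are made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>

enum { BLOCK = 256 };  /* vector length reported in the compiler output */

/* Pass 1 (assumed): one partial sum per "block" of BLOCK consecutive elements. */
static void partial_sums(const float *a, int n, float *partial, int nblocks)
{
    for (int b = 0; b < nblocks; b++) {        /* plays the role of blockIdx.x */
        float s = 0.0f;
        for (int t = 0; t < BLOCK; t++) {      /* plays the role of threadIdx.x */
            int i = b * BLOCK + t;
            if (i < n)
                s += a[i];
        }
        partial[b] = s;
    }
}

/* Pass 2 (assumed): a single block folds the partials.
   Lane t accumulates partials t, t+BLOCK, ...; the lanes are then combined. */
static float final_sum(const float *partial, int nblocks)
{
    float lane[BLOCK] = {0};
    for (int t = 0; t < BLOCK; t++)
        for (int b = t; b < nblocks; b += BLOCK)
            lane[t] += partial[b];

    float z = 0.0f;
    for (int t = 0; t < BLOCK; t++)
        z += lane[t];
    return z;
}

/* Hypothetical driver: runs both passes on the host and returns the sum. */
float two_stage_sum(const float *a, int n)
{
    assert(a != NULL && n >= 0);
    int nblocks = (n + BLOCK - 1) / BLOCK;
    float *partial = malloc((nblocks > 0 ? nblocks : 1) * sizeof *partial);
    assert(partial != NULL);

    partial_sums(a, n, partial, nblocks);
    float z = final_sum(partial, nblocks);

    free(partial);
    return z;
}
```

If each thread in the first pass handles one element, grid [4096] with block [256] would correspond to n = 1,048,576, but that is an inference from the diagnostic rather than something I've checked against the generated code.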
Cheers,
-Nick.