This is loosely related to the thread "OpenACC deepcopy support in current nvc releases? It works, but is it supported?": similar example, different question. I noticed some (to me) weird behavior with attach / detach of struct pointer members. It seems that nvc only generates attach / detach operations when the struct itself is copied explicitly, not when it is copied implicitly.
I am using NVHPC SDK 21.9 on Ubuntu 20.04 and the following code:
//deepcopy_test.c
#include <stdlib.h>
#include <stdio.h>
typedef struct points {
    float* x; float* y;
    int n;
} points;
int main() {
    points p;
    p.n = 1000;
    p.x = ( float*) malloc ( sizeof ( float )*p.n );
    p.y = ( float*) malloc ( sizeof ( float )*p.n );
    // working pragma:
    //#pragma acc data copy(p) copy(p.x[:p.n]) copyin(p.y[:p.n])
    // generates implicit copyin(p) but no attach for p.x, p.y:
    #pragma acc data copy(p.x[:p.n]) copyin(p.y[:p.n])
    {
        #pragma acc parallel loop
        for (int i=0; i<p.n; ++i ) p.x[i] += p.y[i];
    }
    printf("all done, exiting\n");
}
Compiling and running this yields:
$ nvc -Wall -O3 -g -Minfo=accel -acc=strict deepcopy_test.c -o deepcopy_test && NVCOMPILER_ACC_NOTIFY=3 ./deepcopy_test
main:
19, Generating copy(p.x[:p.n]) [if not already present]
Generating Tesla code
21, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
19, Generating implicit copyin(p) [if not already present]
Generating copyin(p.y[:p.n]) [if not already present]
upload CUDA data file=[...]/deepcopy_test.c function=main line=19 device=0 threadid=1 bytes=4000
upload CUDA data file=[...]/deepcopy_test.c function=main line=19 device=0 threadid=1 bytes=4000
upload CUDA data file=[...]/deepcopy_test.c function=main line=19 device=0 threadid=1 variable=p bytes=24
launch CUDA kernel file=[...]/deepcopy_test.c function=main line=19 device=0 threadid=1 num_gangs=1024 num_workers=1 vector_length=128 grid=1024 block=128
Failing in Thread:1
call to cuStreamSynchronize returned error 700: Illegal address during kernel execution
Notice how the two pointer members are created and copied, but not attached. With the other pragma line, which also contains an explicit copy clause for p, everything works:
$ nvc -Wall -O3 -g -Minfo=accel -acc=strict deepcopy_test.c -o deepcopy_test && NVCOMPILER_ACC_NOTIFY=3 ./deepcopy_test
main:
19, Generating copy(p) [if not already present]
Generating copyin(p.y[:p.n]) [if not already present]
Generating copy(p.x[:p.n]) [if not already present]
Generating Tesla code
21, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
upload CUDA data file=[...]/deepcopy_test.c function=main line=19 device=0 threadid=1 variable=p bytes=24
upload CUDA data file=[...]/deepcopy_test.c function=main line=19 device=0 threadid=1 variable=.attach. bytes=8
upload CUDA data file=[...]/deepcopy_test.c function=main line=19 device=0 threadid=1 bytes=4000
upload CUDA data file=[...]/deepcopy_test.c function=main line=19 device=0 threadid=1 variable=.attach. bytes=8
upload CUDA data file=[...]/deepcopy_test.c function=main line=19 device=0 threadid=1 bytes=4000
launch CUDA kernel file=[...]/deepcopy_test.c function=main line=19 device=0 threadid=1 num_gangs=1024 num_workers=1 vector_length=128 grid=1024 block=128
upload CUDA data file=[...]/deepcopy_test.c function=main line=22 device=0 threadid=1 variable=.detach. bytes=8
download CUDA data file=[...]/deepcopy_test.c function=main line=22 device=0 threadid=1 variable=p bytes=24
download CUDA data file=[...]/deepcopy_test.c function=main line=22 device=0 threadid=1 bytes=4000
all done, exiting
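To make explicit what the .attach. uploads in the second log correspond to, here is a host-only sketch of the difference between the two runs. It does not use the OpenACC runtime at all: the "device" is simulated with ordinary malloc'd buffers, and the helpers fake_device_copyin, fake_attach and members_attached are hypothetical names made up for this illustration, not nvc or OpenACC API. The assumption being illustrated is that a bytewise copy of p carries host addresses in p.x and p.y until an attach step overwrites them with the device addresses.

```c
#include <stdlib.h>
#include <string.h>

typedef struct points {
    float *x;
    float *y;
    int n;
} points;

/* Hypothetical stand-in for "allocate on the device and upload".
   The "device" here is just a second host allocation. */
static void *fake_device_copyin(const void *host, size_t bytes) {
    void *dev = malloc(bytes);
    if (dev != NULL) memcpy(dev, host, bytes);
    return dev;
}

/* Hypothetical stand-in for attach: overwrite the pointer member inside
   the "device" struct with the address of the "device" array. */
static void fake_attach(float **dev_member, float *dev_array) {
    *dev_member = dev_array;
}

/* Returns 1 if both pointer members of the "device" struct point at the
   "device" arrays, 0 if they still hold host addresses, and -1 if an
   allocation failed. */
int members_attached(int do_attach) {
    points p;
    float *dev_x = NULL;
    float *dev_y = NULL;
    points *dev_p = NULL;
    int result = -1;

    p.n = 4;
    p.x = calloc((size_t)p.n, sizeof(float));
    p.y = calloc((size_t)p.n, sizeof(float));

    if (p.x != NULL && p.y != NULL) {
        /* Same order as the failing run: arrays and struct are uploaded. */
        dev_x = fake_device_copyin(p.x, sizeof(float) * (size_t)p.n);
        dev_y = fake_device_copyin(p.y, sizeof(float) * (size_t)p.n);
        dev_p = fake_device_copyin(&p, sizeof p);
    }

    if (dev_x != NULL && dev_y != NULL && dev_p != NULL) {
        if (do_attach) {
            /* The extra 8-byte pointer updates seen in the working run. */
            fake_attach(&dev_p->x, dev_x);
            fake_attach(&dev_p->y, dev_y);
        }
        result = (dev_p->x == dev_x && dev_p->y == dev_y) ? 1 : 0;
    }

    free(dev_p);
    free(dev_y);
    free(dev_x);
    free(p.y);
    free(p.x);
    return result;
}
```

Calling members_attached(0) models the failing run, where the struct copy still carries host pointers, and members_attached(1) models the working run. On a real GPU, dereferencing the unpatched host pointers inside the kernel would be consistent with the "Illegal address" error above.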
I’m not sure whether this is a bug. Mat Colgrove already mentioned that the spec forbids putting the same variable in multiple copy clauses to avoid ambiguity, and that gcc in particular does not like it. Maybe I should stop trying it anyway? ;) Reading the OpenACC 2.7 spec, I haven’t found that restriction yet (though I have not searched for it specifically); if anyone knows the specifics, please share. Still, in a copy clause, isn’t copy(p.x[:p.n]) a pointer reference? If so, shouldn’t it trigger an attach? Otherwise, how would I reference the device memory that was created?
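For comparison, here is a minimal sketch of the same computation with unstructured data directives, where p is made present explicitly before its members are copied. This is an assumption about what should behave like the working pragma above; I have not verified it against nvc 21.9 or against gcc. The function name add_with_parent_first and the self-check at the end are made up for this example. Without OpenACC enabled, the pragmas are ignored and the loop simply runs on the host.

```c
#include <stdlib.h>

typedef struct points {
    float *x;
    float *y;
    int n;
} points;

/* Adds y to x element-wise and returns the number of wrong results,
   or -1 if an allocation failed. */
int add_with_parent_first(int n) {
    points p;
    int wrong = 0;

    p.n = n;
    p.x = malloc(sizeof(float) * (size_t)n);
    p.y = malloc(sizeof(float) * (size_t)n);
    if (p.x == NULL || p.y == NULL) {
        free(p.x);
        free(p.y);
        return -1;
    }
    for (int i = 0; i < n; ++i) {
        p.x[i] = (float)i;
        p.y[i] = 1.0f;
    }

    /* Parent struct first, so that it is already present when the
       members are copied and their pointers can be attached. */
    #pragma acc enter data copyin(p)
    #pragma acc enter data copyin(p.x[:p.n], p.y[:p.n])

    #pragma acc parallel loop present(p)
    for (int i = 0; i < p.n; ++i) p.x[i] += p.y[i];

    /* Members first on the way out, then the parent struct. */
    #pragma acc exit data copyout(p.x[:p.n]) delete(p.y[:p.n])
    #pragma acc exit data delete(p)

    /* Host-side check: every x[i] should now be i + 1. */
    for (int i = 0; i < n; ++i) {
        if (p.x[i] != (float)i + 1.0f) ++wrong;
    }

    free(p.y);
    free(p.x);
    return wrong;
}
```

The design choice here is only the ordering: the struct is entered before its members and deleted after them, so p never appears in more than one data clause on the same directive.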
Any insights are highly appreciated.