Do you have a complete reproducing example you can provide? In particular, I’d want to see how you are creating the “syn_” structure on the device. You may also consider adding a “present(syn_)” on the parallel loop so the compiler doesn’t attempt to implicitly copy it. (Of course this is assuming that you’re copying “syn_” at a higher level).
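To illustrate what I mean by copying at a higher level and using “present” on the loop, here is a minimal sketch. The names `sum_weights` and `run` and the use of a plain `double` array are my assumptions for illustration, not your actual code. Without an OpenACC compiler the pragmas are ignored and the code simply runs on the host.

```cpp
// Hypothetical helper: sums an array that the caller has already placed on the device.
double sum_weights(const double* syn, int n) {
    double ret = 0.0;
    // present(...) tells the compiler the data is already on the device,
    // so it does not attempt an implicit copy here.
    #pragma acc parallel loop reduction(+:ret) present(syn[0:n])
    for (int i = 0; i < n; ++i) {
        ret += syn[i];
    }
    return ret;
}

// Hypothetical caller: the "higher level" that owns the device copy.
double run(const double* syn, int n) {
    double total = 0.0;
    // The structured data region copies syn to the device on entry
    // and releases the device copy on exit.
    #pragma acc data copyin(syn[0:n])
    {
        total = sum_weights(syn, n);
    }
    return total;
}
```

If the array is not on the device when the loop is reached, “present” should produce a runtime error rather than a silent implicit copy, which makes this kind of problem easier to spot.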
Thank you for your answer. This code is part of a big project, and we plan to gradually move its most parallel parts to OpenACC. This code is one of the first steps in that direction, so it is not easy to extract it as a separate, self-contained example.
I do not create syn_ on the device explicitly. I expected that OpenACC would transfer it to the device (knowing its size from the loop limits [0, nsynapses_]), perform the processing, and destroy it. Am I wrong? Should I transfer it explicitly?
I understand that this code is not very efficient, but it is only a first test step…
Early versions of the OpenACC standard left it up to the compiler implementation how to handle implicit copying of pointers. While I don’t use GNU myself and don’t know their interpretation, I would expect they follow the current standard, which clarifies the behavior: pointers implicitly use the “no_create” clause, which creates a local private copy of the pointer with its value initialized from the host address. I highly doubt they will implicitly copy the full array using the loop bounds.
Instead, try adding “copyin(syn[0:nsynapses])” to the parallel directive to make the copy explicit and in line with the current OpenACC standard.
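As a rough sketch of the explicit copy, assuming the array is reachable through a plain local pointer (the function name `sum_with_copyin` and the `double` element type are hypothetical):

```cpp
// Hypothetical free-function version of the reduction loop.
double sum_with_copyin(const double* syn, int nsynapses) {
    double ret = 0.0;
    // copyin with an explicit array section tells the compiler exactly
    // how many elements to transfer; the loop bounds alone do not.
    #pragma acc parallel loop reduction(+:ret) copyin(syn[0:nsynapses])
    for (int i = 0; i < nsynapses; ++i) {
        ret += syn[i];
    }
    return ret;
}
```

This copies the array to the device every time the function is called, so it is fine for a first test but not for performance.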
error: “snn_library::calculation_backend::AdditiveSynapseGroupOpenAcc::syn_” is not a variable in map clause
16 | #pragma acc parallel loop reduction(+:ret) copyin(syn_[0:nsynapses_])
while syn_ is declared as a member variable in AdditiveSynapseGroupOpenAcc
My guess is that this is a limitation in GNU’s support of OpenACC. I recall that they had issues with aggregate types, which may or may not be supported in later versions. You might try g++ 11, or install nvc++, part of the NVIDIA HPC SDK, which I’ll be better able to help with since I’m part of the NV HPC compiler team.
while syn_ is declared as a member variable in AdditiveSynapseGroupOpenAcc
In this case, you also need to copy the this pointer over to the device. What I typically do is add an enter data directive in the class constructor that copies the this pointer as well as its data members.
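A minimal sketch of that constructor pattern follows. The class name `SynapseGroupSketch`, the `double` weights, and the `total()` method are made up for illustration and are not your actual AdditiveSynapseGroupOpenAcc. I would expect nvc++ to accept `this` in these data clauses, but support may differ in other compilers, so treat that as an assumption to check.

```cpp
#include <algorithm>

// Hypothetical class; only the data-management pattern matters here.
class SynapseGroupSketch {
public:
    SynapseGroupSketch(const double* weights, int n)
        : nsynapses_(n), syn_(new double[n]) {
        std::copy(weights, weights + n, syn_);
        // Copy the object itself first, then the array its member points to,
        // so the device copy of syn_ can be attached to the device array.
        #pragma acc enter data copyin(this)
        #pragma acc enter data copyin(syn_[0:nsynapses_])
    }

    ~SynapseGroupSketch() {
        // Release device data in the reverse order, then free host memory.
        #pragma acc exit data delete(syn_[0:nsynapses_])
        #pragma acc exit data delete(this)
        delete[] syn_;
    }

    // Copying is disabled so two objects never share one host/device array.
    SynapseGroupSketch(const SynapseGroupSketch&) = delete;
    SynapseGroupSketch& operator=(const SynapseGroupSketch&) = delete;

    double total() const {
        double ret = 0.0;
        // The data was placed on the device in the constructor.
        #pragma acc parallel loop reduction(+:ret) present(syn_[0:nsynapses_])
        for (int i = 0; i < nsynapses_; ++i) {
            ret += syn_[i];
        }
        return ret;
    }

private:
    int nsynapses_;
    double* syn_;
};
```

If the host array changes after construction, the device copy has to be refreshed with an update directive before the next compute region, since the data is only transferred once in the constructor.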