I have a bunch of questions this time.
In attempting to compile an OpenACC code, I’m getting a message telling me that the compiler failed to translate the accelerator region due to an “Unexpected flow graph”. I think I understand in broad terms what this means, but I would appreciate a more specific explanation.
The same set of compiler outputs contains repeated mentions of
"Loop carried dependence due to exposed use of [array] prevents parallelization"
My first interpretation was that multiple threads were trying to update the same array element, something that could be handled with atomics in CUDA. As an alternative to atomics, I implemented the following in the accelerator code: (1) create a special array just for the accelerator region, (2) zero it out before the OpenACC kernel, (3) perform a sum reduction over the special array after the kernel, and (4) add the result back to the global array. However, the compiler still emits the same message for the new array. So what is responsible for the message?
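For concreteness, here is a minimal C sketch of one possible reading of steps (1) to (4). It is not the actual code from this post. The function name `scatter_add_via_scratch`, the index array `idx`, the `contrib` array, and the choice of a scratch array with the same shape as the global one are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not from the original code): adds contrib[i] into
 * out[idx[i]] by way of a scratch array, following steps (1) to (4). */
int scatter_add_via_scratch(int n, int m, const int *idx,
                            const double *contrib, double *out)
{
    assert(n >= 0 && m > 0);   /* host-side sanity check */

    /* (1) scratch array used only by the accelerator region */
    double *scratch = malloc((size_t)m * sizeof *scratch);
    if (scratch == NULL)
        return -1;

    /* (2) zero it before the kernel */
    for (int j = 0; j < m; j++)
        scratch[j] = 0.0;

    /* Assumption: a kernels region, so the compiler decides on its own
     * whether the loop is safe to parallelize. Two iterations may hit the
     * same scratch[idx[i]], exactly as they would with out[idx[i]]. */
    #pragma acc kernels copyin(idx[0:n], contrib[0:n]) copy(scratch[0:m])
    for (int i = 0; i < n; i++)
        scratch[idx[i]] += contrib[i];

    /* (3) and (4): fold the scratch array back into the global array */
    for (int j = 0; j < m; j++)
        out[j] += scratch[j];

    free(scratch);
    return 0;
}
```

In this reading the writes into `scratch` follow the same pattern as the writes into the global array would have, so a compiler may have the same difficulty proving the iterations independent. Without an OpenACC compiler the pragma is ignored and the function simply runs serially.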
Lastly, there are several references to “Accelerator restriction: induction variable live-out from loop: i”. Some of these line numbers point to loops where the induction variable has been declared private; this suggests I don’t understand how the private declaration works, or what a live-out variable is. There are weirder instances of this message, though: sometimes it points to subroutine calls that don’t use that induction variable (edit to add: the subroutine is being inlined; I know OpenACC doesn’t handle subprogram calls right now). What’s going on there?
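To pin down the terminology: in general compiler usage, a variable is live-out of a loop when its value is read after the loop has finished. The C sketch below contrasts the two cases, assuming hypothetical function names `scale_live_out` and `scale_scoped`. It is an illustration of the term only, and I am not claiming that either version reproduces the exact message quoted above.

```c
#include <assert.h>

/* Index declared outside the loop and read afterwards: the final value
 * of i is needed once the loop ends, so i is live-out of the loop. */
int scale_live_out(int n, double *a, double s)
{
    assert(n >= 0);
    int i;
    #pragma acc kernels copy(a[0:n])
    for (i = 0; i < n; i++)
        a[i] *= s;
    return i;   /* i is read here, after the loop */
}

/* Index scoped to the loop: nothing after the loop can read it, so it
 * is not live-out. The trip count is returned from n instead. */
int scale_scoped(int n, double *a, double s)
{
    assert(n >= 0);
    #pragma acc kernels copy(a[0:n])
    for (int i = 0; i < n; i++)
        a[i] *= s;
    return n;
}
```

Both functions compute the same result. The only difference is whether the loop index outlives the loop.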
Thanks for any/all the advice you can give.