I’ve noticed that OpenACC + nvfortran is leading to some unexpected artifacts in my simulations at -O3 optimization but not at -O2. The code is too long to track down the root of this difference by hand; we have dozens of OpenACC kernels. Is there a clean way to bisect where the issue could be coming from?
Right now, we are exploring turning on -O2 plus other options manually, like -Munroll and such, but I’m not sure every difference between -O2 and -O3 is flippable via a flag (or is even documented, though some are).
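
To make the kind of bisection I have in mind concrete, here is a minimal sketch of bisecting over source files rather than over flags: build a subset of files at -O3 and the rest at -O2, check whether the artifacts appear, and halve the subset. Everything in it is an assumption for illustration: the file names are hypothetical, `is_bad` is a stand-in for "rebuild with this subset at -O3, run the simulation, and compare against the -O2 result", and the search assumes a single file is responsible.

```python
from typing import Callable, List, Sequence


def bisect_files(files: Sequence[str],
                 is_bad: Callable[[List[str]], bool]) -> str:
    """Return the one file whose -O3 build triggers the artifacts.

    Assumption: exactly one culprit, i.e. is_bad(subset) is True
    exactly when the culprit file is in `subset`.
    """
    candidates = list(files)
    if not is_bad(candidates):
        raise ValueError("artifacts do not reproduce with every file at -O3")
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        # Build `half` at -O3 and every other file at -O2, then test.
        if is_bad(half):
            candidates = half
        else:
            candidates = candidates[len(half):]
    return candidates[0]


if __name__ == "__main__":
    # Hypothetical file list; in practice is_bad would drive the build
    # system and the simulation instead of checking a hard-coded name.
    sources = ["advect.f90", "rhs.f90", "bc.f90", "io.f90"]
    culprit = bisect_files(sources, lambda subset: "rhs.f90" in subset)
    print(culprit)
```

This only narrows things down to a file, not to a kernel or to a specific optimization, and it would need extending if the artifacts only show up when several files are at -O3 together.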