I’ve noticed that OpenACC + nvfortran produces unexpected artifacts in my simulations when compiling at -O3 but not at -O2. The code is too long to track down the root cause of this difference by hand; we have dozens of OpenACC kernels. Is there a clean way to bisect where the issue is coming from?
Right now, we are building with -O2 and manually adding individual options such as -Munroll, but I’m not sure that every difference between -O2 and -O3 can be toggled with a flag, or is even documented (though some are).
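To make the question concrete, here is a minimal sketch of the per-file bisection I have in mind, as an alternative to bisecting over flags. It assumes exactly one source file is responsible, that the artifact reproduces deterministically, and that objects built at -O2 and -O3 can be linked together. The names `bisect_culprit` and `is_bad` are hypothetical; `is_bad` stands in for whatever build-and-run check applies (compile the given files at -O3 and the rest at -O2, run the simulation, report whether the artifacts appear).

```python
from typing import Callable, List, Sequence


def bisect_culprit(files: Sequence[str],
                   is_bad: Callable[[List[str]], bool]) -> str:
    """Return the single file whose -O3 build triggers the artifact.

    Assumption: exactly one culprit, and is_bad(subset) is deterministic.
    is_bad(subset) is a caller-supplied (hypothetical) check that builds
    `subset` at -O3 and every other file at -O2, runs the simulation,
    and returns True if the artifacts show up.
    """
    candidates = list(files)
    if not is_bad(candidates):
        raise ValueError("artifact does not reproduce with all files at -O3")

    # Halve the candidate set until one file remains.
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        if is_bad(half):
            candidates = half                    # culprit is in the first half
        else:
            candidates = candidates[len(half):]  # culprit is in the second half
    return candidates[0]
```

With N source files this needs roughly log2(N) builds and runs. If several files interact to produce the artifact, the single-culprit assumption breaks and a more general delta-debugging search would be needed.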