I’ve noticed that OpenACC + nvfortran produces some unexpected artifacts in my simulations when compiling with -O3 optimization but not with -O2. The code is too long to track down the root of this difference by hand; we have dozens of OpenACC kernels. Is there a clean way to bisect where the issue could be coming from?
Right now, we are exploring building with -O2 and manually turning on other options, like -Munroll and such, but I’m not sure every difference between -O2 and -O3 can be toggled via a flag (or is even documented, though some are).
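
To make the question concrete, here is a minimal sketch of the kind of per-file bisection I have in mind. It assumes the artifact is triggered by a single source file being compiled at -O3 and that the artifact reproduces deterministically. The names find_culprit and shows_artifact are hypothetical: shows_artifact would have to rebuild with the given files at -O3 and everything else at -O2, run the simulation, and report whether the artifact appears. Here it is replaced by a stand-in so the sketch runs on its own.

```python
from typing import Callable, Sequence, Set


def find_culprit(files: Sequence[str],
                 shows_artifact: Callable[[Set[str]], bool]) -> str:
    """Bisect over source files to find the one that misbehaves at -O3.

    shows_artifact(o3_files) is a hypothetical callback: it should build
    the files in o3_files at -O3 and all remaining files at -O2, run the
    simulation, and return True if the artifact shows up.
    Assumes exactly one file is responsible.
    """
    candidates = list(files)
    if not candidates:
        raise ValueError("no source files given")
    # Sanity check: the artifact must reproduce with everything at -O3.
    if not shows_artifact(set(candidates)):
        raise ValueError("artifact does not reproduce with all files at -O3")

    while len(candidates) > 1:
        mid = len(candidates) // 2
        first_half = candidates[:mid]
        # Only the first half is built at -O3; the rest stays at -O2.
        if shows_artifact(set(first_half)):
            candidates = first_half
        else:
            candidates = candidates[mid:]
    return candidates[0]


if __name__ == "__main__":
    # Stand-in for a real build-and-run step; file names are made up.
    sources = ["grid.f90", "flux.f90", "riemann.f90", "output.f90", "main.f90"]
    fake_check = lambda o3_files: "riemann.f90" in o3_files
    print(find_culprit(sources, fake_check))
```

This needs roughly log2(N) rebuilds for N files. If more than one file contributes, or the problem only appears when two files are both at -O3, this simple halving would not be enough.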