Dear OpenACC community,
we have a large code base that was developed with the PGI compiler 17.10 and OpenACC on a P100 cluster. With this setup and the -O3 option it produced correct results.
On a different cluster we are now using the PGI compiler 18.5 and the V100.
However, the program does not generate correct results anymore.
So far we have tracked it down to the optimization flag: if we compile and link with -O0, we also get correct results using the PGI compiler 18.5. But from -O1 up to -O3 the results are incorrect.
At some point the algorithm also starts to print out “NaN”.
We compile and link with these flags:
PGIOPTS=-Mcuda=9.0,ptxinfo
PGIOPTS+=-Mpreprocess
PGIOPTS+=-Mlarge_arrays -mcmodel=medium
PGIOPTS+=-ta=tesla:cc70
PGIOPTS+=-O0
PGIOPTS+=-mp
PGIOPTS+=-acc -Minfo=accel -Minfo
We also tried the -fast option for compiling and linking, but it still generates wrong results.
Is there a way to debug it and find out what happens?
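For instance, one way to narrow down where the NaNs first appear might be a small host-side check called after each major compute step. The following is only a minimal sketch, assuming the data of interest is a flat array of doubles that has already been copied back to the host; the name first_nan is hypothetical and not part of our code:

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical helper (not from the original code base):
 * returns the index of the first NaN in a[0..n-1],
 * or -1 if the array contains no NaN. */
long first_nan(const double *a, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        if (isnan(a[i])) {
            return (long)i;
        }
    }
    return -1;
}
```

Such a check could be called after bringing an array back with an OpenACC update directive, and the first compute region after which it returns a non-negative index would be the place to look more closely.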
Thank you for your help