OpenACC: -O2 and above gave wrong results

oukore · March 6, 2020, 10:16am

Hi, here’s an example that fails with -O2 and above. -O1 and below give correct results.

The example is so short and easy that I don’t think extra explanation is necessary.

Tested on cc70 and cc60 cards. The results are the same.

I’ve already found a workaround (function g in the example), but this workaround isn’t practical in my project. I need the original function f to work too.

Thanks in advance.

// ubuntu 18.04
// pgc++ 19.10-0 LLVM 64-bit target on x86-64 Linux -tp haswell
// PGI Compilers and Tools

// pgc++ -acc -ta:tesla,cc70 -Minfo=accel acc.cpp -O2; ./a.out

#include <cstdio>

#pragma acc routine seq
inline void f1(float& a) { a = -1; }

#pragma acc routine seq
inline void f2(float& a) { a = -2; }

#pragma acc routine seq
inline void f(float& a, int flag) {
   if (flag == 1) f1(a);
   else f2(a);
}

#pragma acc routine seq
inline void g(float& a, int flag) {
   if (flag == 1) a = -1;
   else a = -2;
}

int main() {
   int n = 1;
   float outf, outg;
   #pragma acc parallel loop
   for (int i = 0; i < n; ++i) {
      float cf = 10, cg = 10;
      f(cf, 1); // wrong
      g(cg, 1); // right
      outf = cf;
      outg = cg;
   }
   printf("outf %lf\n", outf); // prints 10.0 with -O2 and above
   printf("outg %lf\n", outg); // always prints -1.0

   return 0;
}

MatColgrove · March 9, 2020, 8:45pm

Thanks stw. I can recreate the error here and have added an issue report (TPR #28202).

Looks like a regression introduced between the 19.5 and 19.7 compilers. After inlining, the compiler performs an optimization that basically eliminates the if checks (since flag is always 1) and just sets the out values to the end result. Somehow, though, in the case where there are multiple levels of if statements, it uses the initial value instead of the result value.
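To make that description concrete, here is a host-only sketch of what a correct inline-then-fold sequence should reduce f(cf, 1) to. The names f_inlined, f_folded and check_folding are hypothetical and exist only for illustration; this is not the compiler's actual intermediate form.

```cpp
#include <cassert>

inline void f1(float& a) { a = -1; }
inline void f2(float& a) { a = -2; }

// Original two-level routine from the reproducer.
inline void f(float& a, int flag) {
   if (flag == 1) f1(a);
   else f2(a);
}

// Step 1 (hypothetical name): f1 and f2 inlined into f.
inline void f_inlined(float& a, int flag) {
   if (flag == 1) a = -1;
   else a = -2;
}

// Step 2 (hypothetical name): flag is known to be 1, so the branch folds away.
inline void f_folded(float& a) { a = -1; }

// All three forms must leave the same value in the argument when flag == 1.
inline void check_folding() {
   float x = 10, y = 10, z = 10;
   f(x, 1);
   f_inlined(y, 1);
   f_folded(z);
   assert(x == -1.0f);
   assert(y == -1.0f);
   assert(z == -1.0f);
}
```

The reported bug corresponds to the folded form leaving the initial value 10 in place instead of storing -1.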

As a workaround, you can disable automatic inlining (-Mnoautoinline).

% pgc++ -ta=tesla -O2 test.cpp; a.out
outf 10.000000
outg -1.000000
% pgc++ -ta=tesla -O2 test.cpp -Mnoautoinline; a.out
outf -1.000000
outg -1.000000
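If you want to confirm that a given set of flags produces the right answer in your own build, a small self-check along these lines may help. It is only a sketch: run_f_in_kernel, expected and f_matches_reference are hypothetical helper names, and the one-element array is an assumption made so the result has an explicit copyout clause.

```cpp
#include <cassert>

#pragma acc routine seq
inline void f1(float& a) { a = -1; }

#pragma acc routine seq
inline void f2(float& a) { a = -2; }

#pragma acc routine seq
inline void f(float& a, int flag) {
   if (flag == 1) f1(a);
   else f2(a);
}

// Hypothetical helper: calls f inside an OpenACC loop and returns the result.
inline float run_f_in_kernel(int flag) {
   float out[1] = {0};
   #pragma acc parallel loop copyout(out[0:1])
   for (int i = 0; i < 1; ++i) {
      float c = 10;
      f(c, flag);
      out[i] = c;
   }
   return out[0];
}

// Hypothetical helper: the value f is supposed to produce, written out by hand.
inline float expected(int flag) { return flag == 1 ? -1.0f : -2.0f; }

// True when the kernel result matches the hand-written reference.
inline bool f_matches_reference(int flag) {
   assert(flag == 1 || flag == 2);
   return run_f_in_kernel(flag) == expected(flag);
}
```

Calling f_matches_reference(1) from main and returning a nonzero exit code on failure would let a build script flag the miscompiled case.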

-Mat

oukore · March 10, 2020, 7:22am

Hi Mat, thank you for your explanation.

My follow-up question is: if this acc routine were called inside a compute-intensive kernel, is this fix going to cause more overhead compared to the correctly inlined version, or will the overhead be eliminated by other following steps in compilation?

I understand the strictly correct answer might be tricky so I was just expecting a general reply. Thanks.

MatColgrove · March 10, 2020, 3:32pm

is this fix going to cause more overhead compared to the correctly inlined version, or will the overhead be eliminated by other following steps in compilation?

Most likely it will cause some performance loss. Probably not too bad, but you’d need to run it through a profiler to see the effect. Though as you note, comparing performance against code that gives incorrect answers is not very useful.
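Short of a full profiler run, a rough wall-clock comparison can give a first impression. The sketch below is an assumption-laden example, not a benchmark: time_kernel is a hypothetical helper, and the idea is to build it twice (with and without -Mnoautoinline) and compare the reported times. The first OpenACC launch may include device initialization, so a warm-up call before the timed one is advisable.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

#pragma acc routine seq
inline void f1(float& a) { a = -1; }

#pragma acc routine seq
inline void f2(float& a) { a = -2; }

#pragma acc routine seq
inline void f(float& a, int flag) {
   if (flag == 1) f1(a);
   else f2(a);
}

// Hypothetical helper: runs f once per element and returns the elapsed
// wall time in seconds. Results are stored in `out` so they can be checked.
inline double time_kernel(std::vector<float>& out, int flag) {
   assert(!out.empty());
   const int n = static_cast<int>(out.size());
   float* p = out.data();
   const auto t0 = std::chrono::steady_clock::now();
   #pragma acc parallel loop copyout(p[0:n])
   for (int i = 0; i < n; ++i) {
      float c = 10;
      f(c, flag);
      p[i] = c;
   }
   const auto t1 = std::chrono::steady_clock::now();
   return std::chrono::duration<double>(t1 - t0).count();
}
```

Checking the contents of `out` after each run matters here as well, since a faster build that returns 10 instead of -1 is not a useful baseline.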

-Mat

aglobus1 · June 12, 2020, 1:44am

Fixed in HPC SDK 20.5, which is currently in early access (EA): https://developer.nvidia.com/hpc-sdk
