-fast compiler instruction is producing incorrect results.

Hi all,

I am trying to use pgfortran to compile a program that I have previously compiled with gfortran. When I ran the program I noticed that the output contained both incorrect values and NaNs. I compiled with these flags:

pgfortran -Minfo=ccff -fast

When I recompiled without the ‘-fast’ option, the answers came out correct and identical to those from the gfortran-compiled code.

I am using the 14.3 release.

Any ideas about what might be causing this would be great, thanks. I am going to try switching from single to double precision to see if that makes any difference to the output.

EDIT: I replaced -fast with its individual components as follows:

-Mflushz -Mnoframe -Mlre -Mpre -Mcache_align -Mvect=simd -Munroll=c:1 -O2

I tested each flag and found that the last two, -Munroll=c:1 and -O2, each cause the incorrect values and NaNs. All the other flags are fine as long as neither of those two is present. -O1 seems to work fine too.
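A minimal sketch of that flag-by-flag test, assuming a single source file. The name prog.f95, the helper functions, and the log and executable names are all hypothetical, and an exact byte comparison against the unoptimised build is stricter than strictly necessary, since optimisation can legitimately change the last digits of floating-point output:

```shell
#!/bin/sh
# FC and SRC are placeholders; prog.f95 is a hypothetical file name.
FC=${FC:-pgfortran}
SRC=${SRC:-prog.f95}

# Compile $SRC with the given flags (possibly none), run it,
# and save the program output to the log file named in $1.
build_and_run() {
    log=$1; shift
    "$FC" "$@" -o ./flagtest.exe "$SRC" >/dev/null 2>&1 || return 1
    ./flagtest.exe > "$log" 2>&1
    return 0
}

# Build once with no flags as the reference, then once per flag,
# and report whether each flag changes the program output.
check_flags() {
    build_and_run baseline.log || { echo "baseline build failed"; return 1; }
    for flag in -Mflushz -Mnoframe -Mlre -Mpre -Mcache_align \
                -Mvect=simd -Munroll=c:1 -O2; do
        if ! build_and_run flag.log "$flag"; then
            echo "$flag: build failed"
        elif cmp -s baseline.log flag.log; then
            echo "$flag: same output as baseline"
        else
            echo "$flag: output differs from baseline"
        fi
    done
}

# Only run when the compiler and the source file are actually available.
if command -v "$FC" >/dev/null 2>&1 && [ -f "$SRC" ]; then
    check_flags
else
    echo "skipping: need $FC on PATH and $SRC in the current directory"
fi
```

Setting FC and SRC in the environment points the same loop at a different compiler or source file.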

Any help on the issue would be great, thanks.

Hi Harry,

Can you either post a small snippet of the code that demonstrates the problem, or point us at a copy of the code? I would like to check this and see if we have a possible bug.

Also, if you could tell us the platform of the system you are running on, that would also be helpful. In particular, I would like to know what platform the PGI driver thinks it is generating code for. Please run pgcc with the “-V” flag, and note the platform after the “-tp” option below (e.g. “sandybridge” in this case):

$ pgcc -V

pgcc 14.3-0 64-bit target on x86-64 Linux -tp sandybridge
The Portland Group - PGI Compilers and Tools
Copyright (c) 2014, NVIDIA CORPORATION. All rights reserved.

This will help us with trying to reproduce and isolate the problem.

Thanks in advance,

+chris

My pgcc -V and pgfortran -V output are exactly the same as yours, i.e. version 14.3, linux x86_64 and -tp sandybridge.

The system has 2 x Intel Xeon E5-2650 CPUs, which are Sandy Bridge-EP architecture as far as I know. It also has 64GB RAM and an NVIDIA Tesla K20c, and it is running Ubuntu 13.04.

I have narrowed down the incorrect results to a function called ‘applyBC’ in one of my modules. This module can be seen in the Dropbox link at the bottom of this post. If you compile as follows:

gfortran -O3 -ffast-math -funroll-loops --param max-unroll-times=4 -fno-protect-parens fieldtest.f95

and run the output file, you will see the correct result for ‘applyBC’. If you compile the same file with:

pgfortran fieldtest.f95

you get the same result. However, compiling with pgfortran and either the -O2 or the -Munroll=c:1 flag changes the applyBC result.

Hope this helps to work out why this is happening.

Here is the fieldtest.f95 file:

In the same file, I have also been testing the filter function (replace a%applyBC with a%filter in the test case). This compiles and runs fine with no flags, but with the -acc flag it still compiles and the behaviour of the filter function changes. Again, gfortran compiles and runs that function fine. I have a feeling it is to do with how the compiler deals with pointers when they are used in calculations.

Thanks, Harry

UPDATE: The -acc flag also causes the same incorrect result, which means I cannot compile any of my new, sped-up acc code, because the -acc flag breaks other areas of the program.

Is there a way to prevent the -acc flag operating on a certain portion of the file? E.g. get it to compile lines x-y without -acc, then the rest with? Just so I can have my applyBC and filter functions working normally…

Any help on this ASAP would be great as I am working to a tight deadline.

Thanks, Harry

Harry,

Thanks for the report. I have been able to reproduce the issue with your code here at PGI. I have filed a bug report on this today, and will let you know as soon as I hear something on this.

As for -acc, the compiler should only generate accelerator code at points where there are ‘acc’ directives in your code (#pragma acc in C/C++, or !$acc in Fortran). Thus, the way to disable accelerator code generation in certain sections of your program is simply to remove the directives. One way to do this is to change !$acc to !$xxx in a given region, which makes the Fortran compiler treat the line as an ordinary comment instead of a directive.
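As a rough illustration of that sentinel change, the sketch below writes a small made-up Fortran file and uses sed to rename !$acc to !$xxx on a chosen line range only. The file names, the subroutine, and the line range 4-8 are assumptions for the example, not taken from the code in this thread:

```shell
#!/bin/sh
# Hypothetical demo file; in practice this would be your own source.
cat > demo_acc.f95 <<'EOF'
subroutine double_all(a, n)
  integer :: n, i
  real :: a(n)
!$acc kernels
  do i = 1, n
     a(i) = 2.0 * a(i)
  end do
!$acc end kernels
end subroutine double_all
EOF

# Rewrite the sentinel on lines 4-8 only, keeping any leading indentation,
# so directives elsewhere in the file would stay active.
sed '4,8s/^\([[:space:]]*\)!\$acc/\1!$xxx/' demo_acc.f95 > demo_noacc.f95

# Show the lines that were turned into plain comments.
grep -n '!\$xxx' demo_noacc.f95
```

The pattern is case-sensitive, so directives written as !$ACC would need a second expression. Writing to a new file rather than editing in place keeps the original source untouched.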

Passing the -acc switch likely causes certain other optimizations to be enabled, so I suspect the problem with this code actually could be an optimization issue rather than an OpenACC issue.

Best regards,

+chris

Well, that was fast!

Our developer has just fixed this issue in the compiler, and the fix should be included in the PGI 14.4 update, due in a few weeks.

Best regards,

+chris

Thank you for sorting that out for me!

I’m afraid I have a deadline in a week’s time, so 14.4 may come too late for me. My plan now is to move the OpenACC routines that I want into a separate file/module, compile that with the -acc flag, and then compile my other original modules without the optimisation flags before linking them.
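A sketch of how that split build might look. It assumes the OpenACC routines live in a hypothetical acc_routines.f95 that fieldtest.f95 uses (hence the compile order), and that -acc is also passed at link time so the accelerator runtime is linked in:

```shell
#!/bin/sh
# FC can be overridden; acc_routines.f95 is a hypothetical file name.
FC=${FC:-pgfortran}

build_split() {
    # OpenACC routines: the only file compiled with -acc.
    "$FC" -acc -c acc_routines.f95 || return 1
    # Remaining modules: no -acc and no -O2 / -Munroll=c:1.
    "$FC" -c fieldtest.f95 || return 1
    # Link everything; -acc here is assumed to pull in the OpenACC runtime.
    "$FC" -acc acc_routines.o fieldtest.o -o fieldtest || return 1
}

# Only attempt the build when the compiler is actually installed.
if command -v "$FC" >/dev/null 2>&1; then
    build_split
else
    echo "$FC not found; skipping build"
fi
```

If the dependency runs the other way (the OpenACC routines use modules defined in fieldtest.f95), the first two compile steps would need to be swapped so the module files exist when they are needed.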

(I’m assuming you couldn’t give me a prerelease version of 14.4 with those fixes)

Thanks again, Harry

14.4 is out now, and provides the correct answers.

regards,
dave