Compatibility between compilers

Hello,

My question concerns the use of the PGI compiler on CPU, not on GPU.

I have a scientific code in Fortran that is compiled with ifort, and now I’m using pgf90 (because of CUDA Fortran), but the final results are different. I want to produce the same results as ifort using pgf90.

The code is compiled with the flag -mieee-fp using ifort, and I tried the flag -Kieee using pgf90, but the results don’t match. I also tried -tp=px, but it didn’t have any effect. Is there another flag that would make the results the same? I need to use the ifort results as the reference, so I don’t want to change the flags used with ifort.

The example below illustrates the differences. Although the difference is small in this case, these are intermediate values in a numerical method that tests for convergence, and the final results differ by hundreds or even thousands.

System: Intel Core i7 7820x, Linux CentOS 7.5.1804
Compilers: Intel Fortran 19.0.5.281 (ifort), PGI Fortran 19.10-0 (pgf90)
Compilation: Intel → ifort -o prog.ifort prog.f90 -mieee-fp; PGI → pgf90 -o prog.pgf90 prog.f90 -Kieee

Thanks

Results:
Intel
val: 2914347.25
PGI
val: 2914347.50

program precision

real x1, x2, x3, val

x1 = 0.00999999977648258209228515625
x2 = 539.693969726562499999999999999999999999
x3 = 150.0

val = x1*x2*x3*3600.0

write(*,'(F30.20)') val

end program precision

Hi Henrique,

I guess I’m not convinced that this is a PGI issue. If I run your code with XLF with their strict IEEE flag, the result agrees with what pgf90 produces. Also, if you change the code slightly by breaking it into two operations, ifort gets the same result as the other two.

Is there a reason why you’re not using double precision? It may help, especially if your code is sensitive to precision issues.

% xlf --version
/opt/ibm/xlf/16.1.1/bin/.orig/xlf: 1501-216 (W) command option --version is not recognized - passed to ld
/opt/ibm/xlf/16.1.1/bin/.orig/xlf: 1501-294 (S) No input file specified. Please use -qhelp for more information.
% xlf test.f90 -qstrictieeemod -O0 ; a.out
** precision   === End of Compilation 1 ===
1501-510  Compilation successful for file test.f90.
  2914347.50000000000000000000


% cat test2.f90
program precision

real x1, x2, x3, val

x1 = 0.00999999977648258209228515625
x2 = 539.693969726562499999999999999999999999
x3 = 150.0

val = x1*x2*x3
val = val*3600.0
write(*,'(F30.20)') val

end program precision
% ifort -mieee-fp -O0 test2.f90 ; a.out
  2914347.50000000000000000000

Using the original example with double precision, both compilers agree:

% ifort -mieee-fp -O0 test2.f90 -r8 ; a.out
  2914347.37138269608840346336
% nvfortran test.f90 -O0 -r8 ; a.out
  2914347.37138269608840346336
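
For a sense of scale, the two single-precision results are adjacent representable values. A small sketch (not part of the original runs; the program and variable names are made up for illustration) using the standard spacing intrinsic:

```fortran
program spacing_demo
  use, intrinsic :: iso_fortran_env, only: real32, real64
  implicit none
  real(real32) :: v4
  real(real64) :: v8

  v4 = 2914347.5_real32
  v8 = 2914347.5_real64

  ! spacing(x) is the distance between x and the nearest
  ! neighbouring representable value of the same kind
  write(*,'(A,ES12.4)') 'real32 spacing: ', spacing(v4)
  write(*,'(A,ES12.4)') 'real64 spacing: ', spacing(v8)
end program spacing_demo
```

At this magnitude the real32 spacing is 0.25, so 2914347.25 and 2914347.50 differ by a single unit in the last place, while the real64 spacing is about 4.7e-10.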

-Mat

Hello MatColgrove,

The original code does not use double precision in this part, and if it is changed to double precision, the final results of the application are very different. Therefore I want to keep the code unchanged.

On the machine I’m using, I get the following results with the original code, as in the example (x1*x2*x3*3600.0):

% ifort -o prog.ifort prog.f90; prog.ifort
2914347.50000000000000000000
% ifort -o prog.ifort prog.f90 -mieee-fp; prog.ifort
2914347.250000000000000000000
% pgf90 -o prog.pgf90 prog.f90; prog.pgf90
2914347.50000000000000000000
% pgf90 -o prog.pgf90 prog.f90 -Kieee; prog.pgf90
2914347.50000000000000000000

I’d like to know why the case with the flag -mieee-fp using ifort is the only one with a different result, and how I could get the same result with pgf90.

I separated the computation into two parts, as you mentioned, and the results are all equal to 2914347.50, regardless of compiler or flags.

Please, let me know if there is a way for pgf90 to produce 2914347.25.

Thanks

I looked at the assembly code ifort generates. It looks like they are using the x87 FPU registers, which are 80 bits wide. So while the data types are still 32 bits, the multiplies are being performed in 80-bit precision and the result is only rounded when it is stored, thus giving you slightly different results.

You get consistent answers when breaking the computation into two parts since this adds an extra store back into 32 bits before the final multiply by 3600 is done back in 80 bits.
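
To make the rounding effect concrete, here is a minimal sketch that is not taken from ifort's generated code. It uses real64 as a stand-in for the x87's 80-bit intermediates (an assumption for illustration only) and a volatile temporary to force every intermediate back to 32 bits. The program and variable names are made up.

```fortran
program rounding_demo
  use, intrinsic :: iso_fortran_env, only: real32, real64
  implicit none
  real(real32) :: x1, x2, x3
  real(real32), volatile :: t   ! volatile: each assignment is stored as 32 bits
  real(real64) :: wide          ! stand-in for the wider x87 intermediates

  x1 = 0.00999999977648258209228515625_real32
  x2 = 539.6939697265625_real32
  x3 = 150.0_real32

  ! Round to single precision after every multiply
  t = x1*x2
  t = t*x3
  t = t*3600.0_real32
  write(*,'(A,F12.2)') 'rounded each step: ', t

  ! Keep wider intermediates and round to single precision once at the end
  wide = real(x1, real64)*real(x2, real64)*real(x3, real64)*3600.0_real64
  write(*,'(A,F12.2)') 'rounded once:      ', real(wide, real32)
end program rounding_demo
```

Working through the arithmetic by hand, the first line should print 2914347.50 and the second 2914347.25, matching the two values reported in this thread. This only shows where the difference comes from; it is not a flag-level way to make pgf90 match ifort with the source unchanged.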

Please, let me know if there is a way for pgf90 to produce 2914347.25.

No, sorry. We haven’t used the x87 FPU for a long time.

Note that you will see this issue on any other architecture that doesn’t have an 80-bit FPU, which explains why XLF gives the same results as pgf90. This includes NVIDIA GPUs, which perform single-precision computation in 32 bits.

I saw that the pgf90 compiler includes the flag -pc that can be set to 80, but I tried compiling as below and it didn’t work.

% pgf90 -o prog.pgf90 prog.f90 -tp=px -pc=80; prog.pgf90
2914347.50000000000000000000

Thanks for your help.

The precision control flag, “-pc”, was only valid for the old 32-bit ABI compilers, which were the only place where we supported x87. We moved to pure 64-bit computation when moving to the 64-bit ABI.

I’ll make a note, since “-pc” shouldn’t be in the help messages any longer; we stopped supporting 32-bit a long time ago. It’s not in the User’s Guide, but it appears to have been missed in the help messages.

Besides the compiler help, I also found information about the -pc flag on this webpage:
https://www.pgroup.com/resources/docs/19.10/x86/pgi-ref-guide/index.htm