pgf77 performance issue 6.0 vs 5.2

I have some legacy f77 code that I’ve been using to compare the 5.2 and 6.0 compilers. The system I am running on is an Intel Pentium 4 running Suse 9.3, Linux 2.6.11. Doing a

pgf77 -V

I get 5.2-4 and 6.0-8 respectively. The CPU times (in seconds) I get for two different programs on 5.2 are:
238.951 and 221.501

For the same programs built with 6.0 I get:
268.297 and 245.096

In both cases the optimization-related compiler options are:
-fast -fastsse -Minline

I have tested other programs and gotten consistent results, with 6.0 producing slower code than 5.2. This also appears to be true on our 64-bit Opteron systems. However, I would be very happy to move to 6.0 if I can resolve this problem, because that release corrects run-time problems in other programs that are compiled with 5.2 on our Linux 2.4 systems.

Hi akushner,

Can you post the 5.2 and 6.0 runtimes for the following flagsets:

  1. -fastsse
  2. -fastsse -Mipa=fast,inline

I want to see if the regression is caused by inlining or by some other optimization. Also, I want to see what happens if you use IPA inlining instead.

I suspect that a routine that was being inlined no longer is. To see which subroutines are being inlined, add “-Minfo=inline” to the compilation line and compare the output between 5.2 and 6.0.
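One rough way to do that comparison is to save the compiler messages from each build to a file and diff the lines that mention inlining. The sketch below is only an illustration: the function names and the sample messages are made up, and the real “-Minfo=inline” wording may differ, so adjust the filter to whatever your logs actually contain.

```python
def inline_messages(log_text):
    """Return the set of whitespace-normalized lines that mention 'inlined'."""
    return {
        " ".join(line.split())
        for line in log_text.splitlines()
        if "inlined" in line.lower()
    }


def compare_inlining(old_log, new_log):
    """Return (messages only in the old build, messages only in the new build)."""
    old, new = inline_messages(old_log), inline_messages(new_log)
    return sorted(old - new), sorted(new - old)


# Hypothetical sample logs, NOT real compiler output.
sample_52 = """
solve:
    12, dot3 inlined, size=8
    40, axpy inlined, size=11
"""
sample_60 = """
solve:
    12, dot3 inlined, size=8
"""

lost, gained = compare_inlining(sample_52, sample_60)
print("inlined by 5.2 only:", lost)
print("inlined by 6.0 only:", gained)
```

Anything listed under “5.2 only” is a candidate for a routine that 6.0 stopped inlining.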

Note that “-fast” is part of “-fastsse” so is not needed.

  • Mat


Thanks for the reply. I’ll run the tests and post the info when I get back to the office on Monday.

We have never used -Mipa because the link phase reports (I can’t recall the exact message) that it was turned off for lack of a main routine, or something similar. The entry to the programs is through a C front end, so I assumed that was what turned it off (we have to use -Mnomain). If we could get -Mipa to work that would be great.

Also, thanks for the note about -fast and -fastsse. I thought I saw some flags that were turned on by -fast but not by -fastsse, but I may have misread the manual.

IPA’s most likely complaining that it’s missing some IPA information. If you compile the C portion of the code with IPA as well, the message should go away. Also, you can try “-Mipa=fast,inline,safe”. “safe” tells pgipa that you think it’s safe to go ahead with the IPA recompilation even if you’re missing some information.

  • Mat

Unfortunately we do not have PGI’s C compiler licensed. The pertinent output for both the 5.2 and 6.0 compiles with the -Mipa and -Minfo flags you suggested looked like this:

1, extracting subprogram for IPA, size 35
1, extracting subprogram for IPA, size 28
1, extracting subprogram for IPA, size 52
1, extracting subprogram for IPA, size 22
IPA inhibited: no main routine

So, I don’t think -Mipa is a factor. The runtimes for the four runs are:
5.2 -fastsse -Mipa=fast,inline,safe:  239.535u 1.054s 4:12.18 95.4% 0+0k 0+0io 3pf+0w
5.2 -fastsse:                         240.561u 1.128s 4:16.59 94.1% 0+0k 0+0io 3pf+0w
6.0 -fastsse -Mipa=fast,inline,safe:  270.837u 0.981s 4:45.92 95.0% 0+0k 0+0io 2pf+0w
6.0 -fastsse:                         270.509u 0.620s 4:46.15 94.7% 0+0k 0+0io 0pf+0w
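To put a number on the gap, here is a small sketch that pulls the user CPU time out of each timing line and computes how much slower the 6.0 build is. It assumes the first field of each line is user seconds with a trailing “u”, as in the lines above; the helper names are my own.

```python
def user_seconds(time_line):
    """Extract user CPU seconds from a line whose first field looks like '239.535u'."""
    first = time_line.split()[0]
    if not first.endswith("u"):
        raise ValueError(f"unexpected timing field: {first!r}")
    return float(first[:-1])


def slowdown_percent(old_seconds, new_seconds):
    """Percentage by which the new time exceeds the old time."""
    return (new_seconds - old_seconds) / old_seconds * 100.0


# Timing lines copied from the four runs reported in this post.
runs = {
    "-fastsse -Mipa=fast,inline,safe": (
        "239.535u 1.054s 4:12.18 95.4% 0+0k 0+0io 3pf+0w",
        "270.837u 0.981s 4:45.92 95.0% 0+0k 0+0io 2pf+0w",
    ),
    "-fastsse": (
        "240.561u 1.128s 4:16.59 94.1% 0+0k 0+0io 3pf+0w",
        "270.509u 0.620s 4:46.15 94.7% 0+0k 0+0io 0pf+0w",
    ),
}

for flags, (line_52, line_60) in runs.items():
    pct = slowdown_percent(user_seconds(line_52), user_seconds(line_60))
    print(f"{flags}: 6.0 is {pct:.1f}% slower than 5.2")
```

Both flag sets come out at a slowdown of roughly an eighth, which is consistent with -Mipa making no real difference here.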

While several other applications I’ve tested have shown 5.2 object code to be faster than the equivalent 6.0 code, I did test a different application this morning in which the 6.0 code is 10% faster than the 5.2 code.


Hi Andy,

It’s not inlining, so the next step is to start breaking out the individual components of “-fastsse” to determine which optimization is causing the slow-down. The most likely culprits are “-Mlre”, “-Msmart”, and “-Mvect=sse”.

Try running with:

“-fastsse -Mnosmart”, “-fastsse -Mnolre”, “-fastsse -Mnovect”.

If those aren’t it, start at “-O2” and then progressively add in the following optimizations: “-O2 -Munroll=c:1 -Mnoframe -Mlre -Msmart -Mvect=sse -Mscalarsse -Mcache_align -Mflushz”.
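If it helps to keep the progressive runs organized, the flag sets can be generated rather than typed by hand. This is only a sketch: the helper name is made up, and it just prints the flag strings in the order listed above. Building and timing each one is left to your existing build setup.

```python
BASE = "-O2"
# Flags in the order given above; each step adds one more to the previous set.
EXTRA_FLAGS = [
    "-Munroll=c:1",
    "-Mnoframe",
    "-Mlre",
    "-Msmart",
    "-Mvect=sse",
    "-Mscalarsse",
    "-Mcache_align",
    "-Mflushz",
]


def progressive_flag_sets(base, extras):
    """Return [base, base + first extra, base + first two extras, ...]."""
    return [" ".join([base] + extras[:i]) for i in range(len(extras) + 1)]


for step, flags in enumerate(progressive_flag_sets(BASE, EXTRA_FLAGS)):
    print(f"step {step}: {flags}")
```

The first step whose runtime jumps relative to the 5.2 build at the same step points at the flag added in that step.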

Also, can you send us your code, or is it available on the web? We should release 6.1 this week and I’d like to see if the regression still occurs. If it does, then I’ll file a technical problem report (TPR) to have the regression fixed.