IPA not effective

I am compiling a Fortran 77 programme with pgf90 and linking in an f90 library, which, however, is used in only a few subroutines.
If I use -Mipa, only 5 out of 101 subroutines are optimised with IPA.
How can I find out the reason why nothing is done on all the others?
I don’t see what would be special in the 5 routines that are being IPA-ed.

Hi bokumet,

“-Mipa” is equivalent to “-Mipa=const”, so the 5 routines may be the only ones that benefit from constant propagation. Try “-Mipa=fast,inline,libopt” to have the compiler perform more IPA optimizations. You can add “-Minfo=ipa” to have the compiler print which optimizations are being performed.
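As a rough sketch, a build line with these flags could look like the one below. The source file names, the executable name and the library name are placeholders, not taken from the original post. As far as I know, the same -Mipa option also has to be present at the link step, which a single compile-and-link command like this one takes care of.

```shell
# Sketch only: main.f, solver.f, myprog and -lmyf90lib are hypothetical names.
SRC="main.f solver.f"
IPA_FLAGS="-fast -Mipa=fast,inline,libopt -Minfo=ipa"
CMD="pgf90 $IPA_FLAGS -o myprog $SRC -lmyf90lib"

# Show the command that would be run.
echo "$CMD"

# Only invoke the compiler if it is actually installed on this machine.
if command -v pgf90 >/dev/null 2>&1; then
    $CMD
fi
```

The -Minfo=ipa messages printed during the build should then show which routines took part in which IPA optimizations.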

- Mat

Thank you – the additional parameters led to about 40% of the subroutines being included in an IPA optimisation.

The not-so-good news: the run time did not improve. It is even slightly longer than with just -fast.
By the way, -fastsse seems to be a bit slower than plain -fast (AMD Opteron 250 processor).

While “-fastsse -Mipa=fast,inline” generally gives the best performance, that’s not true for all applications, so it’s beneficial to try out a few combinations. Using “-fast” as your baseline, here are some other flag sets to try:

  • -fast -Mipa=fast
  • -fast -Mipa=inline
  • -fast -Mvect=sse
  • -fast -Munroll=n:4
  • -fast -Mconcur (only for a multi-core system; run with the environment variable “NCPUS” set to the number of cores you have)

- Mat
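One way to work through flag sets like these is a small script that builds one executable per set and times it. This is only a sketch: the source file name, the executable names and the DRY_RUN switch are made up for illustration, and by default the script only prints the commands it would run. It assumes bash for the `time` keyword. -Mconcur is left out because it also needs NCPUS set and free cores.

```shell
# Sketch only: SRC, the prog_* names and DRY_RUN are hypothetical.
SRC="${SRC:-main.f}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = build and run

try_flags() {
    # $1 = tag used in the executable name, remaining args = compiler flags
    tag="$1"; shift
    echo "pgf90 $* -o prog_$tag $SRC"
    if [ "$DRY_RUN" = "0" ]; then
        # Build, then time the run only if the build succeeded.
        pgf90 "$@" -o "prog_$tag" $SRC && time "./prog_$tag"
    fi
}

try_flags base    -fast
try_flags ipafast -fast -Mipa=fast
try_flags ipainl  -fast -Mipa=inline
try_flags sse     -fast -Mvect=sse
try_flags unroll  -fast -Munroll=n:4
```

Running it with DRY_RUN=0 on an otherwise idle machine, and repeating each run a few times, gives timings that are easier to compare against the -fast baseline.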

Thank you for these suggestions. I have not tried -Mconcur, as there are no free processors right now. I tried out the other optimisations, and they were not really beneficial.

I am wondering if there are rules to keep in mind when writing a programme so that these kinds of optimisations can be beneficial. Or is it rather the case that they only help if the code is not written well? Anyway, I would like to know where I could learn more about which types of code structures benefit from which kinds of optimisations.

I would suggest starting by reading Chapter 2 of the PGI Users Guide to get a basic understanding of the optimizations available. Next, profile your application at a medium optimization level (like “-O2 -Mprof=lines” or “-O2 -pg”) and use PGPROF to find any hotspots. See the PGI Tools Guide for more information about PGPROF and profiling.

Once you have identified the hotspots, the areas of your code that take the most time, compile these sections of code at various optimization levels. Add “-Minfo=all -Mneginfo=all” to have the compiler display which optimizations it’s performing and which ones it has tried but failed to apply. Pay particular attention to the neginfo messages, since these indicate where source code changes might be helpful.
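The profile-then-feedback workflow described above might be sketched as follows. The file names main.f and hotspot.f are hypothetical, and the script only prints the commands rather than running them. How to open the resulting profile in PGPROF is left to the PGI Tools Guide.

```shell
# Sketch only: main.f and hotspot.f are hypothetical file names.
SRC="${SRC:-main.f}"
HOT="${HOT:-hotspot.f}"

# Step 1: instrumented build at medium optimization for line-level profiling.
PROFILE_BUILD="pgf90 -O2 -Mprof=lines -o prog_prof $SRC"
echo "$PROFILE_BUILD"

# Step 2: after running ./prog_prof and locating the hotspots in PGPROF,
# recompile the hot file at several levels with full compiler feedback.
feedback_cmd() {
    # $1 = optimization flag to try
    echo "pgf90 $1 -Minfo=all -Mneginfo=all -c $HOT"
}

for opt in -O2 -fast; do
    feedback_cmd "$opt"
done
```

Comparing the -Mneginfo output between the two levels is a simple way to see which loops the compiler gave up on and why.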

We do offer a two-day course, http://www.pgroup.com/support/training.htm, which covers this topic if you’re interested.

Hope this helps,
Mat