PGI not inlining function across files

Hi

I’m trying to get inlining to work with PGI. I use the

-Mextract=lib:lib.il,reshape,name:...

option to create an inline library and then want to inline using

-Minline=lib:lib.il,reshape

This works for all functions except one, which I cannot get to inline at all. I tried the other options, e.g. setting totalsize=10000, but this specific function is never inlined, even though it is extracted correctly during the extract phase. The function is defined in a different file from the one where it is called; when I copy the function definition into the calling file, it is inlined correctly!

I’m not able to provide a reproducer at the moment, but are there any other options I can set during compilation to get more information about why it does or doesn’t inline a function? Or are there any limitations on inlining across files?

Thanks
Yannick

Hi Yannick,

This is the correct process for cross-file inlining, so it’s unclear what’s wrong. Maybe it’s an issue with the number of inlining levels, which defaults to 2? Try adding “-Minline=lib:lib.il,reshape,levels:10” to increase the call depth to 10 levels.

-Mat

Hi Mat

I thought so as well, but I think I have tried all the options of the -Minline flag (totalsize, maxsize, smallsize, levels). Since the function is inlined correctly when it is in the same file, it shouldn’t be one of those. It’s a bit inconvenient that I cannot provide you with a reproducible example atm… Is there any option I can set (besides -Minfo=inline) to get more info on why the function was ignored for inlining?

No, sorry.

Hi Mat

I found out that this is only a problem with the llvm code generator. If I compile the code with the

-Mnollvm

flag then the functions are inlined correctly across files. Is this the expected behaviour? Are there any other options I need to set to get inlining to work with the llvm code generator? Or is inlining (across files) just not supported anymore with llvm?

Thanks
Yannick

Hi Yannick,

No, LLVM is expected to have the same behavior as the non-LLVM back-end. Are you able to share the code? If so, I can write up an issue report and have our engineers take a look.

Thanks,
Mat

Hi Mat

Thanks for the quick reply. I’m not able to share the code here, though I’ll try to see if I can reproduce it in a simple example. I’ll also check if we can make the code available to you directly somehow. I’ll update tomorrow.

Thanks
Yannick

Hi Mat

I prepared a small example to reproduce the behavior I observed; you can find it on my GitHub:

There is a simple main function with a do loop, and in this loop there is a call to another function defined in the file utils.f90. The Makefile is copied from the PGI documentation to first extract an inline library and then inline the function into main.f90. Here is what I observed:

When compiling on PGI 18.5, I get the following output:

$ make
pgfortran -O2 -Minfo=all -Mllvm -Mextract=15 -o utils.il utils.f90
pgfortran -O2 -Minfo=all -Mllvm -Minline=utils.il -c main.f90
main:
     17, Memory set idiom, loop replaced by call to __c_mset4
     23, Loop not vectorized: may not be beneficial
         Loop unrolled 4 times
     24, pow2 inlined, size=2, file utils.f90 (12)
pgfortran -O2 -Minfo=all -Mllvm -c utils.f90
pgfortran -o myprog main.o utils.o

The function pow2 is thus correctly inlined. I can also remove the -Mllvm flag and get the same result.
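
To switch between back-ends without editing the Makefile, the build steps can be sketched as a small shell function that takes the back-end flag as a parameter. This is only a minimal sketch, not the Makefile from the example: the function name build_cmds is hypothetical, and the commands are copied from the make output above. It only prints the commands, so it runs without the compiler installed.

```shell
#!/bin/sh
# Print the four build steps from the make output above.
# $1 is the back-end flag (-Mllvm, -Mnollvm, or empty for the default).
# build_cmds is a hypothetical helper name, not part of PGI.
build_cmds() {
    backend="$1"
    # Append the back-end flag only when one was given.
    common="-O2 -Minfo=all${backend:+ $backend}"
    echo "pgfortran $common -Mextract=15 -o utils.il utils.f90"
    echo "pgfortran $common -Minline=utils.il -c main.f90"
    echo "pgfortran $common -c utils.f90"
    echo "pgfortran -o myprog main.o utils.o"
}

# Dry run: show the commands for the non-LLVM back-end.
build_cmds -Mnollvm
```

On a machine where pgfortran is available, piping the output to sh (build_cmds -Mnollvm | sh) would execute the steps.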

When compiling with PGI 19.5/7/9 I get the following output:

$ make
pgfortran -O2 -Minfo=all -Mllvm -Mextract=15 -o utils.il utils.f90
pgfortran -O2 -Minfo=all -Mllvm -Minline=utils.il -c main.f90
main:
     17, Memory set idiom, loop replaced by call to __c_mset4
     23, Loop not vectorized/parallelized: contains call
pgfortran -O2 -Minfo=all -Mllvm -c utils.f90
pgfortran -o myprog main.o utils.o

The function pow2 is not inlined here! Once I set the flag ‘-Mnollvm’ for PGI 19.x, I get the same behavior as for PGI 18.5 though, with the function pow2 inlined correctly:

$ make
pgfortran -O2 -Minfo=all -Mnollvm -Mextract=15 -o utils.il utils.f90
pgfortran -O2 -Minfo=all -Mnollvm -Minline=utils.il -c main.f90
main:
     17, Memory set idiom, loop replaced by call to __c_mset4
     23, Loop not vectorized: loop count too small
         Loop unrolled 8 times
     24, pow2 inlined, size=2, file utils.f90 (12)
pgfortran -O2 -Minfo=all -Mnollvm -c utils.f90
pgfortran -o myprog main.o utils.o
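
For a larger code base, the same comparison can be done mechanically. The sketch below assumes the -Minfo output of each build was captured to a file (for example with make > llvm.log 2>&1; the log file names are hypothetical) and that inlining messages have the form shown above. The helper names inlined_names and missing_inlines are made up for illustration.

```shell
#!/bin/sh
# List the names of functions reported as inlined in a -Minfo log.
# Assumes messages of the form seen in this thread:
#   "     24, pow2 inlined, size=2, file utils.f90 (12)"
inlined_names() {
    sed -n 's/^ *[0-9][0-9]*, \([A-Za-z_][A-Za-z0-9_]*\) inlined,.*/\1/p' "$1" | sort -u
}

# Print functions inlined in the first log but not in the second.
missing_inlines() {
    a=$(mktemp)
    b=$(mktemp)
    inlined_names "$1" > "$a"
    inlined_names "$2" > "$b"
    comm -23 "$a" "$b"    # lines that appear only in the first list
    rm -f "$a" "$b"
}

# Demo with excerpts copied from the two PGI 19.x runs in this thread.
tmpdir=$(mktemp -d)
cat > "$tmpdir/nollvm.log" <<'EOF'
main:
     17, Memory set idiom, loop replaced by call to __c_mset4
     23, Loop not vectorized: loop count too small
         Loop unrolled 8 times
     24, pow2 inlined, size=2, file utils.f90 (12)
EOF
cat > "$tmpdir/llvm.log" <<'EOF'
main:
     17, Memory set idiom, loop replaced by call to __c_mset4
     23, Loop not vectorized/parallelized: contains call
EOF
missing_inlines "$tmpdir/nollvm.log" "$tmpdir/llvm.log"   # prints: pow2
rm -rf "$tmpdir"
```

Any name printed is a function that was inlined in the first build but not in the second, which narrows down where the two back-ends differ.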

Can you confirm this? Or is there something wrong with our PGI 19.x installation? I tried on two different machines of CSCS (https://www.cscs.ch/) and get the same result.


Thanks
Yannick

Thanks Yannick. I was able to reproduce the problem and have filed a report (TPR #27904). Looks like something changed in the compiler around the 18.7 release that prevents it from inlining when using the LLVM back-end. The older non-LLVM-based compilers seem to be able to inline it, so you can use them as a workaround for now.

-Mat