I am currently profiling my code to find out how best to accelerate it. I have access to Tau and have built a Tau-instrumented version of my code, but the auto-instrumentation only reaches the outer loops. Since my code often has loops nested several levels deep, I thought I’d move to pgcollect before trying to hand-instrument it.
I have used pgcollect before on small examples and programs and I love it, but never on anything like the rather massive codebase that I’m now trying to speed up. I did successfully compile the entire codebase with -Minfo=ccff, but when I tried to run the program under pgcollect, the system hung or stalled. I could see the pgsampt task in htop, but it just sat there at 0% CPU for five or more minutes.
So I thought perhaps my code was simply too big. Is there a way to compile in the CCFF information (and perhaps -Mpfi/-Mpfo) for only a subset of my code and then run pgcollect? Thanks to Tau, I already know the three or four Fortran files out of hundreds that I need to focus on, so line-by-line profiling of the entire codebase is something of a waste.
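
To make the question concrete, here is a rough sketch of what I have in mind, written as a dry run that only prints the compile commands. The file names, the -O2 base flags, and the pgfortran driver name are placeholders I made up for illustration, not my actual build. Whether pgcollect copes with a binary in which only some objects carry CCFF information is exactly what I'm asking.

```shell
#!/bin/sh
# Dry run: print the compile command for each source file, adding
# -Minfo=ccff only for the files I want line-level profiles of.
# All file names are hypothetical placeholders.
HOT_FILES="solver.F90 fluxes.F90 advect.F90"
BASE_FLAGS="-O2"   # assumption: stands in for the normal build flags

# Echo the flag set for one source file: base flags, plus
# -Minfo=ccff if the file is listed in HOT_FILES.
flags_for() {
    for hot in $HOT_FILES; do
        if [ "$1" = "$hot" ]; then
            echo "$BASE_FLAGS -Minfo=ccff"
            return
        fi
    done
    echo "$BASE_FLAGS"
}

# Assumption: pgfortran is the compiler driver in use.
for src in solver.F90 io.F90 fluxes.F90; do
    echo "pgfortran $(flags_for "$src") -c $src"
done
```

In a real build this would presumably live in the makefile as a per-file flag override rather than in a script, but the idea is the same: only the handful of hot files get the extra flag.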