PGI Profiler Not Showing All MPI Processes

Dear Support,

I am running PGI 11.8 and I have an issue where the profiler does not show all of my MPI processes; it only shows profiling data for a single process, P 0.

I am running on Red Hat Linux 5, 64-bit, using my own MPICH2 build compiled with debug flags. The code is mixed-language (Fortran, C++, and C). There is no OpenMP involved, only MPI, so I launch multiple single-threaded processes. The machine is a dual-socket Intel Westmere, for a total of 12 cores.

I compile the code with the flags below, which I took from the PGI profiler user guide:

Fortran:
-Mpfo -Mprof=lines -Minfo=ccff -pgcpplibs -w -pc 64 -Mnoopenmp -Kieee -Mpreprocess -Mbyteswapio -Bstatic -Mextend

C and C++:
-Mpfo -Mprof=lines -Minfo=ccff -w -Bstatic

I then run my MPI application as shown below; I only run on the 12 local cores, with no remote nodes involved:
mpirun -np 12 -hostfile nodes app.exe

The Linux top command shows 12 app.exe processes while the application is running.

When the application finishes successfully, a single file “pgprof.out” gets produced.

I launch the profiler as below:
pgprof -exe app.exe

In the “Parallelism” tab, it shows only “P 0”, whereas I expect to see 12 processes.

I also tried adding the “mpich2” option, i.e. “-Mprof=lines,mpich2”, as the profiler user guide suggests, but that didn’t seem to help either. I also tried “-Mprof=time,mpich2” and “-Mprof=func,mpich2”, and it was still showing only one process.

Any idea what might be wrong here?

Thank you for your help.

I suspect the problem results from using -Mpfo in combination with -Mprof=lines. The instrumentation the compiler inserts for -Mpfo will likely conflict with the instrumentation for -Mprof=lines.

You should definitely use -Mprof=lines,mpich2 or -Mprof=time,mpich2 to get MPI message profiles.

Can you try your performance experiment without -Mpfo?
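That is, your Fortran flags from the first post would become the following (unchanged except for dropping -Mpfo, and with the mpich2 sub-option kept so MPI message profiling is still requested):

-Mprof=lines,mpich2 -Minfo=ccff -pgcpplibs -w -pc 64 -Mnoopenmp -Kieee -Mpreprocess -Mbyteswapio -Bstatic -Mextend

and similarly for your C and C++ flags.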

Please post whether or not this works. Thanks.

Thank you for the quick reply.

I tried your suggestion and removed -Mpfo, but the profiler still shows only P 0. I tried both “-Mprof=lines,mpich2” and “-Mprof=time,mpich2”.

Also, without -Mpfo I no longer get the compute-intensity data in the profiler, which is the main reason we are profiling: we want to identify hotspots with high compute intensity to port to the GPU via the accelerator directives.

When I use both “-Mpfo -Mprof=lines,mpich2”, I do get useful compute-intensity feedback; however, I am not confident in its accuracy, since only one process is being profiled.

Any other suggestions for identifying hotspots with high compute intensity in an MPI application?

Thank you for your help.

Sorry that suggestion didn’t help.

It seems like the MPI portion of the profiling isn’t operating. There are at least two potential causes:
(1) your version of MPICH2 was not built to use the PMPI interface;
(2) your application isn’t linked such that the PGI MPI profiling library is included.

One thing you could try is setting the environment variable MPIDIR to point to your MPI installation directory before linking your application. Its value should be the path of the MPI directory that contains “bin”, “lib”, and so on. For example (bash):

$ export MPIDIR=/opt/pgi/linux86-64/2011/mpi2/mpich

Use your own mpich2 directory, of course.
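To see whether the profiling library is actually making it onto your link line, you could also relink with -v, capture the verbose output, and search it for a PGI profiling library. This is just a sketch: “link.log” is a hypothetical capture file, and the “...” stands for your usual flags and object files; the library-name pattern is a guess at the pg*prof* naming scheme, not an official name.

$ pgfortran -v ... 2>&1 | tee link.log
$ grep -oE -- '-lpg[a-z_0-9]*prof[a-z_0-9]*' link.log

If the second command prints nothing, no PGI profiling library was passed to the linker.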

To diagnose this further, we will need more information. Would you please send the following, with a note that it should be forwarded to tools engineering, and reference this posting?

  • output of your link line using pgfortran -v
  • a copy of your pgprof.out file
  • a copy-and-paste of the command you are using to launch your MPI program
  • a copy of your executable, if possible; that would help as well, though I understand it is often not possible


Dear Don,

I’ve sent an email with the requested details.

I will also post the details here for everyone’s reference:

I tried setting the MPIDIR environment variable, but that didn’t help; the profiler still shows only one process.

Also, to rule out a problem with my MPI build, I tried the MPICH2 version bundled with the PGI compiler CDK (i.e. /opt/pgi/linux86-64/2011/mpi2/mpich), on the assumption that it is built with everything needed to hook into the profiler. Even with that version of MPICH2, only one process is profiled. So the problem does not appear to be related to my MPICH2 build, since PGI’s own MPICH2 shows the same behavior.

Below is the output from the link line using “-v”, as requested; for reference, we use CMake for our builds.
I’ve stripped out the few hundred object files being linked and replaced them with a few placeholders for convenience and confidentiality (i.e. file1.f.o, file2.f.o, etc.). The PGI 11.8 compiler CDK is installed in /scratch/pgi11.8.

Linking Fortran executable app.exe
mpif90 for MPICH2 version 1.2.1p1
mpif90 for MPICH2 version 1.2.1p1
pgf90-Info-Switch -Mpfo -Mpfo forces -O2

/usr/bin/ld /usr/lib64/crt1.o /usr/lib64/crti.o /scratch/pgi11.8/linux86-64/11.8/lib/trace_init.o /usr/lib/gcc/x86_64-redhat-linux/4.1.2/crtbeginT.o
/scratch/pgi11.8/linux86-64/11.8/lib/initmp.o /scratch/pgi11.8/linux86-64/11.8/lib/init_pgpf.o /scratch/pgi11.8/linux86-64/11.8/lib/f90main.o -m
elf_x86_64 -dynamic-linker /lib64/ /scratch/pgi11.8/linux86-64/11.8/lib/pgi.ld
-L/scratch/pgi11.8/linux86-64/2011/mpi2/mpich/lib -L/scratch/pgi11.8/linux86-64/2011/mpi2/mpich/lib
-L/scratch/pgi11.8/linux86-64/11.8/lib -L/usr/lib64 -L/usr/lib/gcc/x86_64-redhat-linux/4.1.2 -Bstatic -Bstatic
-lmpichcxx -rpath
-lmpichf90 -rpath /scratch/pgi11.8/linux86-64/2011/mpi2/mpich/lib -lmpichf90 -lmpich -lopa -lpthread -lrt -rpath
/scratch/pgi11.8/linux86-64/11.8/mpi2/mpich/lib -rpath /scratch/pgi11.8/linux86-64/11.8/mpi2/mpich/lib -rpath /scratch/pgi11.8/linux86-64/11.8/lib
-o app.exe
-L/scratch/pgi11.8/linux86-64/11.8/mpi2/mpich/lib -lfmpich -lmpichf90 -L/scratch/pgi11.8/linux86-64/11.8/mpi2/mpich/lib -lfmpich -lmpichf90
-lpgnod_prof -lmpich -lpthread -lzceh -lgcc_eh --eh-frame-hdr -lstdz -lCz /scratch/pgi11.8/linux86-64/11.8/lib/nonuma.o -lpgmp -lpthread -lpgftnrtl
-lpgftnrtl -lpgf90 -lpgf90_rpm1 -lpgf902 -lpgf90rtl -lpgftnrtl -lnspgc -lpgc -lrt -lpthread -lm -lgcc -lgcc_eh -lc -lgcc -lgcc_eh -lc
/usr/lib/gcc/x86_64-redhat-linux/4.1.2/crtend.o /usr/lib64/crtn.o
/scratch/pgi11.8/linux86-64/11.8/mpi2/mpich/lib/libmpich.a(simple_pmi.o): In function `PMII_Connect_to_pm':
/home/sw/cdk/cdk/mpich2-1.2.1p1/mpich2-1.2.1p1/src/pmi/simple/simple_pmi.c:1088: warning: Using 'gethostbyname' in statically linked applications
requires at runtime the shared libraries from the glibc version used for linking
[100%] Built target app.exe

What is the “PGI MPI profiling lib” called, and where is it located? Do the linking details above show that it is being linked in? If not, is it possible to link it manually?

Unfortunately we won’t be able to provide a copy of the pgprof.out file, since it reflects the structure of the application’s source tree, which is confidential.

Commands used to launch the MPI program and profiler:

[user@node001 test]$ where mpd
[user@node001 test]$ where mpiexec
[user@node001 test]$ mpd &
[user@node001 test]$ mpiexec -n 12 ./app.exe -m ./file.input | tee output
[user@node001 test]$ pgprof -exe ./app.exe

Compilation flags being used:

Fortran:
-v -Mpfo -Mprof=lines,mpich2 -Minfo=ccff -pgcpplibs -w -pc 64 -Mnoopenmp -Kieee -Mpreprocess -Mbyteswapio -Bstatic -Mextend -traceback

C and C++:
-v -Mpfo -Mprof=lines,mpich2 -Minfo=ccff -w -Bstatic -traceback

Thank you for your help.