Open MPI + PGI 8.0-4 compilation failure

Hello,

I’m trying to compile Open MPI 1.3 with the PGI 8.0-4 compilers on Red Hat Enterprise Linux 4 U5. Initially it failed with:

libtool: compile: mv -f "file.o" ".libs/file.o"
/bin/sh ../../../libtool --tag=CXX --mode=link /opt/pgi/linux86-64/8.0-4/bin/pgCC -DNDEBUG -O4 -export-dynamic -o libmpi_cxx.la -rpath /opt/openmpi/1.3/pgi/lib mpicxx.lo intercepts.lo comm.lo datatype.lo win.lo file.lo ../../../ompi/libmpi.la -lnsl -lutil -lpthread
libtool: link: /opt/pgi/linux86-64/8.0-4/bin/pgCC -shared .libs/mpicxx.o .libs/intercepts.o .libs/comm.o .libs/datatype.o .libs/win.o .libs/file.o -Wl,--rpath -Wl,/root/RRI/openmpi-1.3/ompi/.libs -Wl,--rpath -Wl,/root/RRI/openmpi-1.3/orte/.libs -Wl,--rpath -Wl,/root/RRI/openmpi-1.3/opal/.libs -Wl,--rpath -Wl,/opt/openmpi/1.3/pgi/lib -L/root/RRI/openmpi-1.3/orte/.libs -L/root/RRI/openmpi-1.3/opal/.libs ../../../ompi/.libs/libmpi.so /root/RRI/openmpi-1.3/orte/.libs/libopen-rte.so /root/RRI/openmpi-1.3/opal/.libs/libopen-pal.so -ldl -lnsl -lutil -lpthread -Wl,-soname -Wl,libmpi_cxx.so.0 -o .libs/libmpi_cxx.so.0.0.0
/usr/bin/ld: .libs/mpicxx.o: relocation R_X86_64_32S against `__vtbl__Q2_3MPI8Datatype' can not be used when making a shared object; recompile with -fPIC
.libs/mpicxx.o: could not read symbols: Bad value
make[2]: *** [libmpi_cxx.la] Error 2
make[2]: Leaving directory `/root/RRI/openmpi-1.3/ompi/mpi/cxx'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/root/RRI/openmpi-1.3/ompi'
make: *** [all-recursive] Error 1

Then I set CFLAGS="-O4 -fPIC" and reconfigured, and it failed with a different error during make:
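The objects named in the first linker error were produced by pgCC. In autoconf-style builds the C++ compiler takes its flags from CXXFLAGS rather than CFLAGS, so setting CFLAGS alone may leave the C++ objects without -fPIC. A minimal sketch for spotting that before running configure follows; the helper name `flags_missing_pic` is made up for illustration, and FFLAGS/FCFLAGS are the usual autoconf names for the Fortran 77 and Fortran 90 flags.

```shell
# Hypothetical helper: list the flag variables that contain neither
# -fPIC nor -fpic. Unset variables are treated as empty.
flags_missing_pic() {
    for var in CFLAGS CXXFLAGS FFLAGS FCFLAGS; do
        eval "val=\${$var:-}"
        case " $val " in
            *" -fPIC "*|*" -fpic "*) ;;   # flag present, nothing to report
            *) echo "$var" ;;             # flag absent for this language
        esac
    done
}
```

Running `flags_missing_pic` in the shell that will run configure prints one variable name per line for each language whose flags lack a PIC option.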

PGC-W-0221-Redefinition of symbol offsetof (/usr/lib/gcc/x86_64-redhat-linux/3.4.6/include/stddef.h: 414)
PGC-W-0258-Argument 1 in macro offsetof is not identical to previous definition (/usr/lib/gcc/x86_64-redhat-linux/3.4.6/include/stddef.h: 414)
PGC-W-0258-Argument 2 in macro offsetof is not identical to previous definition (/usr/lib/gcc/x86_64-redhat-linux/3.4.6/include/stddef.h: 414)
PGC/x86-64 Linux 8.0-4: compilation completed with warnings
source='btl_openib_endpoint.c' object='btl_openib_endpoint.lo' libtool=yes
DEPDIR=.deps depmode=none /bin/sh ../../../../config/depcomp
/bin/sh ../../../../libtool --tag=CC --mode=compile /opt/pgi/linux86-64/8.0-4/bin/pgcc -DHAVE_CONFIG_H -I. -I../../../../opal/include -I../../../../orte/include -I../../../../ompi/include -I../../../../opal/mca/paffinity/linux/plpa/src/libplpa -I../../../.. -D_REENTRANT -DNDEBUG -O4 -fPIC -c -o btl_openib_endpoint.lo btl_openib_endpoint.c
libtool: compile: /opt/pgi/linux86-64/8.0-4/bin/pgcc -DHAVE_CONFIG_H -I. -I../../../../opal/include -I../../../../orte/include -I../../../../ompi/include -I../../../../opal/mca/paffinity/linux/plpa/src/libplpa -I../../../.. -D_REENTRANT -DNDEBUG -O4 -fPIC -c btl_openib_endpoint.c -fpic -DPIC -o .libs/btl_openib_endpoint.o
NOTE: your trial license will expire in 10 days, 6.52 hours.
PGC-S-0060-transport_type is not a member of this struct or union (btl_openib_endpoint.c: 662)
PGC-S-0039-Use of undeclared variable IBV_TRANSPORT_IB (btl_openib_endpoint.c: 662)
PGC/x86-64 Linux 8.0-4: compilation completed with severe errors
make[2]: *** [btl_openib_endpoint.lo] Error 1
make[2]: Leaving directory `/root/RRI/openmpi-1.3/ompi/mca/btl/openib'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/root/RRI/openmpi-1.3/ompi'
make: *** [all-recursive] Error 1

With other compilers, Open MPI installs very easily. It fails only with PGI.

What could be the solution for this?

Thanks,
Sangamesh

It’s also failing in mixed compilation mode:

To compile Open MPI I used gcc and g++ for CC and CXX, and pgf90/pgf95 for Fortran.

But it failed again:

/bin/sh ../../../libtool --mode=compile /opt/pgi/linux86-64/8.0-4/bin/pgf90 -I../../../ompi/include -I../../../ompi/include -I. -I. -I../../../ompi/mpi/f90 -O3 -fPIC -c -o mpi_wtick_f90.lo mpi_wtick_f90.f90
libtool: compile: /opt/pgi/linux86-64/8.0-4/bin/pgf90 -I../../../ompi/include -I../../../ompi/include -I. -I. -I../../../ompi/mpi/f90 -O3 -fPIC -c mpi_wtick_f90.f90 -fpic -o .libs/mpi_wtick_f90.o
NOTE: your trial license will expire in 10 days, 4.64 hours.
NOTE: your trial license will expire in 10 days, 4.64 hours.
/bin/sh ../../../libtool --mode=compile /opt/pgi/linux86-64/8.0-4/bin/pgf90 -I../../../ompi/include -I../../../ompi/include -I. -I. -I../../../ompi/mpi/f90 -O3 -fPIC -c -o mpi_wtime_f90.lo mpi_wtime_f90.f90
libtool: compile: /opt/pgi/linux86-64/8.0-4/bin/pgf90 -I../../../ompi/include -I../../../ompi/include -I. -I. -I../../../ompi/mpi/f90 -O3 -fPIC -c mpi_wtime_f90.f90 -fpic -o .libs/mpi_wtime_f90.o
NOTE: your trial license will expire in 10 days, 4.64 hours.
NOTE: your trial license will expire in 10 days, 4.64 hours.
/bin/sh ../../../libtool --mode=link /opt/pgi/linux86-64/8.0-4/bin/pgf90 -I../../../ompi/include -I../../../ompi/include -I. -I. -I../../../ompi/mpi/f90 -O3 -fPIC -export-dynamic -o libmpi_f90.la -rpath /opt/openmpi/1.3/pgi/lib mpi.lo mpi_sizeof.lo mpi_comm_spawn_multiple_f90.lo mpi_testall_f90.lo mpi_testsome_f90.lo mpi_waitall_f90.lo mpi_waitsome_f90.lo mpi_wtick_f90.lo mpi_wtime_f90.lo ../../../ompi/libmpi.la -lnsl -lutil -lm
libtool: link: /opt/pgi/linux86-64/8.0-4/bin/pgf90 -shared -fpic -Mnomain .libs/mpi.o .libs/mpi_sizeof.o .libs/mpi_comm_spawn_multiple_f90.o .libs/mpi_testall_f90.o .libs/mpi_testsome_f90.o .libs/mpi_waitall_f90.o .libs/mpi_waitsome_f90.o .libs/mpi_wtick_f90.o .libs/mpi_wtime_f90.o -Wl,-rpath -Wl,/root/RRI/openmpi-1.3/ompi/.libs -Wl,-rpath -Wl,/root/RRI/openmpi-1.3/orte/.libs -Wl,-rpath -Wl,/root/RRI/openmpi-1.3/opal/.libs -Wl,-rpath -Wl,/opt/openmpi/1.3/pgi/lib -L/root/RRI/openmpi-1.3/orte/.libs -L/root/RRI/openmpi-1.3/opal/.libs ../../../ompi/.libs/libmpi.so /root/RRI/openmpi-1.3/orte/.libs/libopen-rte.so /root/RRI/openmpi-1.3/opal/.libs/libopen-pal.so -ldl -lnsl -lutil -lm -pthread -Wl,-soname -Wl,libmpi_f90.so.0 -o .libs/libmpi_f90.so.0.0.0
pgf90-Error-Unknown switch: -pthread
make[4]: *** [libmpi_f90.la] Error 1
make[4]: Leaving directory `/root/RRI/openmpi-1.3/ompi/mpi/f90'
make[3]: *** [all-recursive] Error 1
make[3]: Leaving directory `/root/RRI/openmpi-1.3/ompi/mpi/f90'
make[2]: *** [all] Error 2
make[2]: Leaving directory `/root/RRI/openmpi-1.3/ompi/mpi/f90'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/root/RRI/openmpi-1.3/ompi'
make: *** [all-recursive] Error 1

What’s the way out?

Is this a bug with PGI?

Thanks,
Sangamesh

Hi Sangamesh,

These all look like problems with your configuration. How are you configuring OpenMPI?

It works for me using the following commands:

setenv PATH /opt/pgi/linux86-64/8.0-4/bin/:$PATH
tar jxvf openmpi-1.3.tar.bz2
cd openmpi-1.3/
./configure CC=pgcc CXX=pgCC FC=pgf77 F90=pgf90 --prefix=/opt/pgi/linux86-64/8.0-4/openmpi
make
make install
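Note that `setenv` is csh syntax. If your login shell is sh or bash, the PATH step could be written as in the sketch below; the helper name `prepend_path` is made up for illustration, and the directory is the one used in the commands above.

```shell
# Hypothetical helper: put a directory at the front of PATH
# unless it is already listed somewhere in PATH.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                        # already present, leave PATH alone
        *) PATH="$1:$PATH"; export PATH ;;  # otherwise prepend and export
    esac
}

prepend_path /opt/pgi/linux86-64/8.0-4/bin
```

The remaining commands (tar, configure, make) are the same in either shell.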
- Mat

Hi,

Now Open MPI is reconfigured as follows:

[root@localhost openmpi-1.3]# echo $CC
/opt/pgi/linux86-64/8.0-4/bin/pgcc
[root@localhost openmpi-1.3]# echo $CXX
/opt/pgi/linux86-64/8.0-4/bin/pgCC
[root@localhost openmpi-1.3]# echo $F77
/opt/pgi/linux86-64/8.0-4/bin/pgf77
[root@localhost openmpi-1.3]# echo $F90
/opt/pgi/linux86-64/8.0-4/bin/pgf90

# ./configure --prefix=/opt/openmpi/1.3/pgi

Configure is successful.

During make, there were lots of warnings like the following:

PGC-W-0221-Redefinition of symbol offsetof (/usr/lib/gcc/x86_64-redhat-linux/3.4.6/include/stddef.h: 414)
PGC-W-0258-Argument 1 in macro offsetof is not identical to previous definition (/usr/lib/gcc/x86_64-redhat-linux/3.4.6/include/stddef.h: 414)
PGC-W-0258-Argument 2 in macro offsetof is not identical to previous definition (/usr/lib/gcc/x86_64-redhat-linux/3.4.6/include/stddef.h: 414)
PGC-W-0221-Redefinition of symbol offsetof (/usr/lib/gcc/x86_64-redhat-linux/3.4.6/include/stddef.h: 414)
PGC-W-0258-Argument 1 in macro offsetof is not identical to previous definition (/usr/lib/gcc/x86_64-redhat-linux/3.4.6/include/stddef.h: 414)
PGC-W-0258-Argument 2 in macro offsetof is not identical to previous definition (/usr/lib/gcc/x86_64-redhat-linux/3.4.6/include/stddef.h: 414)
PGC-W-0221-Redefinition of symbol offsetof (/usr/lib/gcc/x86_64-redhat-linux/3.4.6/include/stddef.h: 414)
PGC-W-0258-Argument 1 in macro offsetof is not identical to previous definition (/usr/lib/gcc/x86_64-redhat-linux/3.4.6/include/stddef.h: 414)
PGC-W-0258-Argument 2 in macro offsetof is not identical to previous definition (/usr/lib/gcc/x86_64-redhat-linux/3.4.6/include/stddef.h: 414)
PGC/x86-64 Linux 8.0-4: compilation completed with warnings

Why is it giving errors with respect to GCC?

How does PGI really work? On top of the GCC libraries?

Finally the make failed with:

PGC-W-0258-Argument 1 in macro offsetof is not identical to previous definition (/usr/lib/gcc/x86_64-redhat-linux/3.4.6/include/stddef.h: 414)
PGC-W-0258-Argument 2 in macro offsetof is not identical to previous definition (/usr/lib/gcc/x86_64-redhat-linux/3.4.6/include/stddef.h: 414)
PGC/x86-64 Linux 8.0-4: compilation completed with warnings
source='btl_openib_endpoint.c' object='btl_openib_endpoint.lo' libtool=yes \
DEPDIR=.deps depmode=none /bin/sh ../../../../config/depcomp \
/bin/sh ../../../../libtool --tag=CC   --mode=compile /opt/pgi/linux86-64/8.0-4/bin/pgcc -DHAVE_CONFIG_H -I. -I../../../../opal/include -I../../../../orte/include -I../../../../ompi/include -I../../../../opal/mca/paffinity/linux/plpa/src/libplpa   -I../../../..  -D_REENTRANT  -O -DNDEBUG   -c -o btl_openib_endpoint.lo btl_openib_endpoint.c
libtool: compile:  /opt/pgi/linux86-64/8.0-4/bin/pgcc -DHAVE_CONFIG_H -I. -I../../../../opal/include -I../../../../orte/include -I../../../../ompi/include -I../../../../opal/mca/paffinity/linux/plpa/src/libplpa -I../../../.. -D_REENTRANT -O -DNDEBUG -c btl_openib_endpoint.c  -fpic -DPIC -o .libs/btl_openib_endpoint.o
NOTE: your trial license will expire in 9 days, 13 hours.
PGC-S-0060-transport_type is not a member of this struct or union (btl_openib_endpoint.c: 662)
PGC-S-0039-Use of undeclared variable IBV_TRANSPORT_IB (btl_openib_endpoint.c: 662)
PGC/x86-64 Linux 8.0-4: compilation completed with severe errors
make[2]: *** [btl_openib_endpoint.lo] Error 1
make[2]: Leaving directory `/root/RRI/openmpi-1.3/ompi/mca/btl/openib'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/root/RRI/openmpi-1.3/ompi'
make: *** [all-recursive] Error

Here are some more details of the OS environment:

The machine is a quad-socket, quad-core AMD Opteron (x86_64) at 2.7 GHz.

uname -a

Linux localhost.localdomain 2.6.9-55.ELlargesmp #1 SMP Fri Apr 20 16:46:56 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux

cat /etc/redhat-release

Red Hat Enterprise Linux AS release 4 (Nahant Update 5)

gcc --version

gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-8)
Copyright (C) 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Also, as a test, I installed Open MPI 1.2.7 with PGI. It installed!
Then I tried to compile a Fortran application. It went well until the last stage of compilation and failed while linking against the Open MPI libraries:

qm_div.o force.o \
        ../lmod/lmod.a ../lapack/lapack.a ../blas/blas.a \
        ../lib/nxtsec.o ../lib/sys.a  -L/opt/openmpi/1.2.7/pgi/lib -lmpi_f90 -lmpi_f77 -lmpi -lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl -lutil -lpthread -ldl
/usr/bin/ld: skipping incompatible /opt/openmpi/1.2.7/pgi/lib/libmpi_f90.so when searching for -lmpi_f90
/usr/bin/ld: cannot find -lmpi_f90
make[1]: *** [exename] Error 2
make[1]: Leaving directory `/opt/apps/fortran_apps_ompi/src/exename''
make: *** [parallel] Error 2

The libraries of Open MPI are as follows:

# ls /opt/openmpi/1.2.7/pgi/lib/
libmca_common_sm.la        libmpi_cxx.so        libmpi_f77.so.0      libmpi_f90.so.0.0.0  libopen-pal.la        libopen-rte.so
libmca_common_sm.so        libmpi_cxx.so.0      libmpi_f77.so.0.0.0  libmpi.la            libopen-pal.so        libopen-rte.so.0
libmca_common_sm.so.0      libmpi_cxx.so.0.0.0  libmpi_f90.la        libmpi.so            libopen-pal.so.0      libopen-rte.so.0.0.0
libmca_common_sm.so.0.0.0  libmpi_f77.la        libmpi_f90.so        libmpi.so.0          libopen-pal.so.0.0.0  mpi.mod
libmpi_cxx.la              libmpi_f77.so        libmpi_f90.so.0      libmpi.so.0.0.0      libopen-rte.la

With the same OS environment, but on an Intel processor with the Intel compilers, both Open MPI and the Fortran application work without any problems.

So why is it not working with PGI?

Thanks,
Sangamesh

Hi,

FYI,
In the same environment, I was able to build Open MPI 1.3 and then the Fortran application successfully with the Intel 10 compilers.

So what’s wrong with PGI?

Thanks,
Sangamesh

Hi Sangamesh,

Why is it giving errors with respect to GCC?

These aren’t errors, just warnings that can be ignored.
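Going only by the messages quoted in this thread, PGI diagnostics carry a severity letter after the compiler prefix: `-W-` for warnings and `-S-` for severe errors. A minimal sketch for pulling just the severe ones out of a saved build log follows; the function name `pgi_severe` is made up for illustration, and the pattern is inferred from the log lines above rather than from PGI documentation.

```shell
# Hypothetical helper: print only the severe (-S-) PGI diagnostics
# from one or more saved build logs, skipping the -W- warnings.
pgi_severe() {
    grep -E '^PG[A-Z0-9]+-S-[0-9]+-' "$@"
}
```

For example, after `make > make.log 2>&1`, running `pgi_severe make.log` would skip the offsetof warnings and show only the lines that actually stopped the build.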

How does PGI really work? On top of the GCC libraries?

On Linux, it’s necessary to link with GCC’s C runtime library as well as other system libraries in order to interface with the OS. We also need to replace or modify a few system header files in order to work around GNU specific extensions.

Finally the make failed with:

PGC-S-0060-transport_type is not a member of this struct or union (btl_openib_endpoint.c: 662)
PGC-S-0039-Use of undeclared variable IBV_TRANSPORT_IB (btl_openib_endpoint.c: 662)

I investigated this error and determined that the struct member “transport_type” is part of the “ibv_device” struct found in your system’s InfiniBand “verbs.h” header file. I also found this Open MPI bug report regarding a similar error specific to RHEL4U3. Given this, my guess is that RHEL4U5 changed the InfiniBand header files from what Open MPI is expecting. Please report this problem to Open MPI.
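One way to check this guess on the affected machine is to look for the two names from the error messages in the installed header. The sketch below assumes nothing about the header beyond its path, which you pass in; the function name `missing_symbols` is made up for illustration, and `/usr/include/infiniband/verbs.h` in the commented call is only the usual location, not something confirmed in this thread.

```shell
# Hypothetical helper: print each symbol that does NOT appear as a
# whole word in the given header file.
missing_symbols() {
    header=$1
    shift
    for sym in "$@"; do
        grep -qw -- "$sym" "$header" || echo "$sym"
    done
}

# Example call (header path is an assumption; adjust to your install):
# missing_symbols /usr/include/infiniband/verbs.h transport_type IBV_TRANSPORT_IB
```

If either name is printed, the installed header does not declare what btl_openib_endpoint.c uses at line 662, which would be consistent with the header mismatch described above.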

/usr/bin/ld: skipping incompatible /opt/openmpi/1.2.7/pgi/lib/libmpi_f90.so when searching for -lmpi_f90
/usr/bin/ld: cannot find -lmpi_f90

This means that you’re trying to link a 32-bit object with a 64-bit library (or vice versa).
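To see which side is which, you can compare the ELF class of the library with that of one of the application's object files. The sketch below reads the ELF header directly with `od`, so it does not depend on any extra tools; the function name `elf_class` is made up for illustration. In the ELF format, the first four bytes are the magic number and the fifth byte (EI_CLASS) is 1 for 32-bit and 2 for 64-bit.

```shell
# Hypothetical helper: print "32-bit", "64-bit", or "unknown" for a file,
# based on the ELF magic number and the EI_CLASS byte of its header.
elf_class() {
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    class=$(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo "unknown"            # not an ELF file
        return
    fi
    case "$class" in
        1) echo "32-bit" ;;
        2) echo "64-bit" ;;
        *) echo "unknown" ;;
    esac
}
```

For the failing link above, that would mean comparing `elf_class /opt/openmpi/1.2.7/pgi/lib/libmpi_f90.so` with `elf_class force.o`; if the two answers differ, the application and Open MPI were built for different targets.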

Hope this helps,
Mat