error while loading shared libraries: libpgmp.so

I’m trying to run a trivial MPI hello.c program over IB (MVAPICH) using PGI CDK 12.4.

When I submit a test job I get:

/pool/cluster7/hpc/pgi/linux86-64/2012/mpi/mvapich/bin/mpirun_rsh: error while loading shared libraries: libpgmp.so: cannot open shared object file: No such file or directory

Note that:

% ldd /pool/cluster7/hpc/pgi/linux86-64/2012/mpi/mvapich/bin/mpirun_rsh
        linux-vdso.so.1 =>  (0x00007fff7fe82000)
        libm.so.6 => /lib64/libm.so.6 (0x0000003e4da00000)
        libpgmp.so => /pool/cluster7/hpc/pgi/linux86-64/12.4/libso/libpgmp.so (0x00002b17a05ec000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003e4e200000)
        libpgc.so => /pool/cluster7/hpc/pgi/linux86-64/12.4/libso/libpgc.so (0x00002b17a075b000)
        libc.so.6 => /lib64/libc.so.6 (0x0000003e4d600000)
        /lib64/ld-linux-x86-64.so.2 (0x0000003e4d200000)

and

ls -l /pool/cluster7/hpc/pgi/linux86-64/12.4/libso/libpgmp.so
-rwxr-xr-x 2 hpc hpc 237023 Apr 13 14:59 /pool/cluster7/hpc/pgi/linux86-64/12.4/libso/libpgmp.so

If I add

setenv LD_LIBRARY_PATH /pool/cluster7/hpc/pgi/linux86-64/12.4/libso:$LD_LIBRARY_PATH

the error message becomes:

/pool/cluster7/hpc/pgi/linux86-64/2012/mpi/mvapich/bin/mpispawn: error while loading shared libraries: libpgmp.so: cannot open shared object file: No such file or directory

(The compute node sees the

/pool/cluster7/hpc/pgi/linux86-64/12.4/libso/libpgc.so

file just fine.)

% ldd /pool/cluster7/hpc/pgi/linux86-64/2012/mpi/mvapich/bin/mpispawn
        linux-vdso.so.1 =>  (0x00007fff871fc000)
        libpgmp.so => /pool/cluster7/hpc/pgi/linux86-64/12.4/libso/libpgmp.so (0x00002b38c7fc8000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003e4e200000)
        libpgc.so => /pool/cluster7/hpc/pgi/linux86-64/12.4/libso/libpgc.so (0x00002b38c8150000)
        libm.so.6 => /lib64/libm.so.6 (0x0000003e4da00000)
        libc.so.6 => /lib64/libc.so.6 (0x0000003e4d600000)
        /lib64/ld-linux-x86-64.so.2 (0x0000003e4d200000)

What am I missing? Any hint?

Thx, S.

Hi Sylvain,

Is the “/pool/cluster7/hpc/pgi/linux86-64/12.4/libso/” directory accessible from the remote node? Is LD_LIBRARY_PATH being set correctly on the remote node when mpiexec is invoked?

Note: you can try compiling with -Bstatic_pgi to force static linking of the PGI runtime libraries.
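For the hello program itself, a minimal sketch of that suggestion (assuming pgcc is on the PATH; the source and output names are just placeholders):

```shell
# Sketch only: -Bstatic_pgi links the PGI runtime (libpgmp, libpgc, ...)
# statically, so the resulting binary no longer needs those .so files
# at run time.
pgcc -Bstatic_pgi -o hello hello.c

# Check: the PGI shared libraries should no longer appear in the output.
ldd ./hello | grep pg
```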

  • Mat

Hi Mat,

The submitted script, hello.csh, has both

 setenv LD_LIBRARY_PATH /pool/cluster7/hpc/pgi/linux86-64/12.4/libso:$LD_LIBRARY_PATH
 ls -l /pool/cluster7/hpc/pgi/linux86-64/12.4/libso/libpgc.so

so /pool/cluster7 is cross-mounted on the compute node(s), and the setenv changes the error message from

[...]/mpirun_rsh: error while loading shared libraries: [...]

without the setenv, to

[...]/mpispawn: error while loading shared libraries: [...]

with it.

Also, -Bstatic_pgi doesn’t fix this; libpgmp.so is needed by mpirun_rsh and mpispawn themselves, not by hello:

% ldd hello
        linux-vdso.so.1 =>  (0x00007fffcdb89000)
        libibverbs.so.1 => /usr/lib64/libibverbs.so.1 (0x0000003b60600000)
        libibumad.so.3 => /usr/lib64/libibumad.so.3 (0x0000003b60a00000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003e4e200000)
        librt.so.1 => /lib64/librt.so.1 (0x0000003e4ea00000)
        libm.so.6 => /lib64/libm.so.6 (0x0000003e4da00000)
        libc.so.6 => /lib64/libc.so.6 (0x0000003e4d600000)
        libdl.so.2 => /lib64/libdl.so.2 (0x0000003e4de00000)
        /lib64/ld-linux-x86-64.so.2 (0x0000003e4d200000)

If mpirun_rsh runs (thanks to the LD_LIBRARY_PATH setenv), why doesn’t mpispawn see the library?

Have I missed something when installing the CDK (12.4)?

Thx, S.

So the fix is to put

  setenv LD_LIBRARY_PATH /pool/cluster7/hpc/pgi/linux86-64/12.4/libso:$LD_LIBRARY_PATH

in ~/.cshrc, so it is propagated to the shells spawned via ssh on the other nodes…

The solution is thus much simpler… S.
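The underlying mechanism can be demonstrated with plain sh, no PGI or MPI needed (the path below is a placeholder): a variable exported only in the submitting shell is not inherited by a process started with a fresh environment, which is roughly what a remote ssh-spawned launcher sees before any rc file runs. Since csh reads ~/.cshrc for every new shell, interactive or not, a setenv there does reach those remote shells.

```shell
# A variable exported in the current (submitting) shell:
export LD_LIBRARY_PATH=/pool/example/libso
echo "local: [$LD_LIBRARY_PATH]"

# 'env -i' starts a child with a scrubbed environment, roughly what a
# remote ssh-spawned process sees before its rc files run:
env -i sh -c 'echo "remote-like: [$LD_LIBRARY_PATH]"'   # prints []
```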

in ~/.cshrc, so it is propagated to the shells spawned via ssh on the other nodes…

This would have been my next suggestion. In general I don’t personally like doing this, but do from time to time.

libpgmp.so is needed by mpirun_rsh and mpispawn,

Sorry, my misunderstanding. What you may want to do then is go back and rebuild MVAPICH with the “-Bstatic_pgi” flag to link in our static libraries. My guess is that it was compiled with “-Bdynamic”.
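A hypothetical rebuild sketch (the configure options, compiler names, and paths below are illustrative, not the exact MVAPICH recipe; check the MVAPICH build notes for your release):

```shell
# Sketch: build MVAPICH with the PGI compilers and -Bstatic_pgi so the
# launcher binaries (mpirun_rsh, mpispawn) carry the PGI runtime
# statically instead of depending on libpgmp.so at run time.
cd mvapich-src                        # placeholder source directory
./configure CC=pgcc \
    CFLAGS="-Bstatic_pgi" LDFLAGS="-Bstatic_pgi" \
    --prefix=/opt/mvapich-pgi         # placeholder install prefix
make
make install
```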

  • Mat

Mat,

I installed the PGI CDK as provided, which I assumed would be ‘optimal’… I don’t want to rebuild the distribution when the toolkit provides one.

S.

Hi S.

No, the MPICH installs that come with the CDK are debug versions for use with the PGI debugger and profiler. They are also configured for Ethernet, so they won’t be able to take advantage of any optimized hardware you may have. For performance, you need to use the MPI implementation recommended by your hardware vendor.

  • Mat