Problem after switching version of compiler

I have a student license, with which I have been compiling my code using version 15.10 and MPICH, and it works fine.

To use it on the cluster we have now taken an academic license, which uses 16.5. I can also see that it now uses OpenMPI instead of MPICH.

I can compile it fine with the newer version, but now when I run the compiled program I get the following error.

--------------------------------------------------------------------------
Sorry!  You were supposed to get help about:
    dlsym failed
But I couldn't open the help file:
    /proj/pgi/linux86-64/2016/mpi/openmpi-1.10.2/share/openmpi/help-mpi-common-cuda.txt: No such file or directory.  Sorry!
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Sorry!  You were supposed to get help about:
    ini file:file not found
But I couldn't open the help file:
    /proj/pgi/linux86-64/2016/mpi/openmpi-1.10.2/share/openmpi/help-mpi-btl-openib.txt: No such file or directory.  Sorry!
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Sorry!  You were supposed to get help about:
    no device params found
But I couldn't open the help file:
    /proj/pgi/linux86-64/2016/mpi/openmpi-1.10.2/share/openmpi/help-mpi-btl-openib.txt: No such file or directory.  Sorry!
--------------------------------------------------------------------------
 Initializing mesh ...
 Processor_? opening MeshFile tgv16.hdf5
 
 Initializing solution ...
 
 Setting fixed dt =   2.5000000000000001E-004
 
STEP =       1  TIME = 0.0000000000E+00  DT = 2.5000000000E-04
 
[05491] *** Process received signal ***
[05491] Signal: Segmentation fault (11)
[05491] Signal code: Address not mapped (1)
[05491] Failing at address: (nil)
[05491] [ 0] /lib64/libpthread.so.0[0x30a580f790]
[05491] *** End of error message ***
Segmentation fault (core dumped)

I get the feeling that the changed MPI version may be causing this.

In case it is relevant, the code also runs fine when built with the gfortran compilers.

EDIT: So this is weird. We have some nodes with GPUs and some without. The code seems to work on the GPU nodes but not on those without. Note that I have not switched on any CUDA compilation options, so it should not require a GPU.

EDIT: Even pgprof starts and then exits with a similar error, while nvprof runs fine.

[gn012:11262] *** Process received signal ***
[gn012:11262] Signal: Segmentation fault (11)
[gn012:11262] Signal code: Invalid permissions (2)
[gn012:11262] Failing at address: 0x800000000
[gn012:11262] [ 0] /lib64/libpthread.so.0[0x316ee0f710]
[gn012:11262] [ 1] /tmp/tmpxft_00002bfc_00000000-0/libprofileInterceptor.so(+0x1017d5)[0x7ffff7d967d5]
[gn012:11262] [ 2] /tmp/tmpxft_00002bfc_00000000-0/libprofileInterceptor.so(+0x101097)[0x7ffff7d96097]
[gn012:11262] [ 3] /tmp/tmpxft_00002bfc_00000000-0/libprofileInterceptor.so(+0x100ce7)[0x7ffff7d95ce7]
[gn012:11262] [ 4] /tmp/tmpxft_00002bfc_00000000-0/libprofileInterceptor.so(+0x18db6)[0x7ffff7caddb6]
[gn012:11262] [ 5] /tmp/tmpxft_00002bfc_00000000-0/libprofileInterceptor.so(+0x18eef)[0x7ffff7cadeef]
[gn012:11262] [ 6] /lib64/libpthread.so.0[0x316ee0f710]
[gn012:11262] [ 7] /usr/local/pgi-2016/linux86-64/2016/cuda/7.5-pgprof/lib64/libcuinj64.so(+0x1be097)[0x7fffa350b097]
[gn012:11262] [ 8] [0x7fffb2fbc2c0]
[gn012:11262] *** End of error message ***

Re-install the 16.5 Linux compilers. When you are asked to install OpenMPI,
say yes, but say ‘No’ to ‘Do you want to enable NVIDIA GPU support in Open MPI?’.

Then make sure OpenMPI is in your environment.

export PATH=$PGI/linux86-64/16.5/bin:$PGI/linux86-64/2016/mpi/openmpi/bin:$PATH

Now try mpif90 and mpirun.
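
Before rebuilding, it may help to confirm which wrappers the shell actually resolves. The following is only a minimal sketch and not part of the installer instructions: `report_tool` is a hypothetical helper name, and the `ompi_info` query assumes the installed OpenMPI exposes the `mpi_built_with_cuda_support` parameter.

```shell
#!/bin/sh
# Print where a tool resolves on PATH, or say that it is missing.
# report_tool is a hypothetical helper, not a PGI or OpenMPI command.
report_tool() {
    tool="$1"
    if resolved=$(command -v "$tool" 2>/dev/null); then
        echo "$tool -> $resolved"
    else
        echo "$tool -> not found on PATH"
    fi
}

report_tool mpif90
report_tool mpirun
report_tool ompi_info

# If ompi_info is available, ask whether this OpenMPI was built with CUDA support.
# Assumption: the build reports the mpi_built_with_cuda_support parameter.
if command -v ompi_info >/dev/null 2>&1; then
    ompi_info --parsable --all | grep mpi_built_with_cuda_support:value \
        || echo "no CUDA support parameter reported"
fi
```

If `mpif90` and `mpirun` still resolve to the old MPICH install or to a CUDA-enabled OpenMPI, the PATH change above has probably not taken effect in the current shell.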

dave

Is it possible for me to install an alternate build of OpenMPI without CUDA and link against that library instead?

If so, how would I do it?

OpenMPI is built to install in the $PGI directory.

Try installing the compilers (everything) in /tmp/pgi, and verify that they
work with the non-GPU version of OpenMPI.
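
One way to do that verification is a small smoke test run on one of the nodes without a GPU. This is a sketch under assumptions: the program name `hello_mpi` and the temporary directory are arbitrary, and it uses whichever `mpif90` and `mpirun` come first on PATH, so the environment should already point at the /tmp/pgi install.

```shell
#!/bin/sh
# Smoke test: build and run a tiny MPI program with the mpif90/mpirun on PATH.
# The file and directory names here are arbitrary.
workdir=$(mktemp -d)
src="$workdir/hello_mpi.f90"

# Write a minimal Fortran MPI program.
cat > "$src" <<'EOF'
program hello_mpi
  use mpi
  implicit none
  integer :: ierr, rank, nprocs
  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)
  print *, 'rank', rank, 'of', nprocs
  call MPI_Finalize(ierr)
end program hello_mpi
EOF

# Compile and run on two ranks only if both wrappers are available.
if command -v mpif90 >/dev/null 2>&1 && command -v mpirun >/dev/null 2>&1; then
    { mpif90 -o "$workdir/hello_mpi" "$src" && mpirun -np 2 "$workdir/hello_mpi"; } \
        || echo "smoke test failed"
else
    echo "mpif90 or mpirun not found on PATH; nothing was run"
fi
```

If this runs cleanly on a non-GPU node but the real application still crashes, the problem is less likely to be the OpenMPI build itself.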

dave

I was asking because I would have to ask the HPC admin to reinstall PGI, whereas I can reinstall OpenMPI locally myself.
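
For reference, this is roughly what I have in mind for the local build. It is an untested sketch: `PREFIX` and `SRC_DIR` are hypothetical locations, `build_openmpi` is just a helper name I made up, and I am assuming the OpenMPI 1.10.2 source is already unpacked and that the PGI compilers (`pgcc`, `pgc++`, `pgfortran`) are on PATH.

```shell
#!/bin/sh
# Sketch: build OpenMPI from source with the PGI compilers and no CUDA support.
# PREFIX and SRC_DIR are hypothetical locations; adjust them to your system.
PREFIX="${PREFIX:-$HOME/openmpi-1.10.2-nocuda}"
SRC_DIR="${SRC_DIR:-$HOME/src/openmpi-1.10.2}"

build_openmpi() {
    # configure arguments; --without-cuda explicitly disables CUDA support.
    set -- --prefix="$PREFIX" CC=pgcc CXX=pgc++ FC=pgfortran --without-cuda
    if [ -x "$SRC_DIR/configure" ]; then
        # Run in a subshell so the caller's working directory is unchanged.
        (cd "$SRC_DIR" && ./configure "$@" && make && make install)
    else
        # No source tree found: only show what would be run.
        echo "No unpacked source at $SRC_DIR; would run:"
        echo "  ./configure $*"
    fi
}

build_openmpi
```

After that I would put `$PREFIX/bin` at the front of PATH (and `$PREFIX/lib` on LD_LIBRARY_PATH) and rebuild my code with that `mpif90`.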

Thanks.