Runtime error: MPI_Comm_set_errhandler does not work with predefined error handlers when using mpi_f08

Here is the reproducer:

$ cat test.F90
program test
#ifdef BROKEN
use mpi_f08
#else
use mpi
#endif
call MPI_INIT(ierr)
call MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN, ierr )
call mpi_finalize(ierr)
end program test
$ mpif90 test.F90
$ mpirun -np 1 ./a.out
$ mpif90 -DBROKEN test.F90
$ mpirun -np 1 ./a.out
[anniesavoy:35481] *** An error occurred in MPI_Comm_set_errhandler
[anniesavoy:35481] *** reported by process [2748841985,0]
[anniesavoy:35481] *** on communicator MPI_COMM_WORLD
[anniesavoy:35481] *** MPI_ERR_ARG: invalid argument of some other kind
[anniesavoy:35481] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[anniesavoy:35481] *** and potentially your MPI job)
$ which mpif90
/opt/nvidia/hpc_sdk/Linux_x86_64/2021/comm_libs/mpi/bin/mpif90
$

Thank you,

John

Hi John,

Which compiler version and MPI are you using?

I’ve tried OpenMPI 3.1.5, OpenMPI 4.0.5, and HPC-X from both the 21.5 and 21.7 releases, but can’t recreate this error. The only issue I see is the following warning when using 3.1.5, which may or may not be related:

% mpif90 -DBROKEN test.F90 -V21.5; mpirun -np 1 ./a.out
/usr/bin/ld: Warning: alignment 4 of symbol `ompi_f08_mpi_errors_return' in /proj/nv/Linux_x86_64/21.5/comm_libs/openmpi/openmpi-3.1.5/lib/libmpi_usempif08.so is smaller than 8 in /tmp/nvfortranFx5lDVTqpuIO.o

-Mat

Hi Mat, thanks for the quick reply:

$ pgf90 -V

pgf90 (aka nvfortran) 21.1-0 LLVM 64-bit target on x86-64 Linux -tp sandybridge
PGI Compilers and Tools
Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.

$ which mpif90
/opt/nvidia/hpc_sdk/Linux_x86_64/2021/comm_libs/mpi/bin/mpif90

Following the trail of symlinks for the install, it looks like it’s picking up the Open MPI installation here:

/opt/nvidia/hpc_sdk/Linux_x86_64/21.1/comm_libs/openmpi/openmpi-3.1.5
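
For anyone retracing this step, the symlink chain can usually be resolved in one command instead of by hand. This is a minimal sketch, assuming a POSIX shell and a readlink that supports -f (GNU coreutils does); resolve_tool is a hypothetical helper name, not part of the HPC SDK.

```shell
# Hypothetical helper: print the real file behind a command found on PATH.
resolve_tool() {
    # command -v prints the path the shell would execute; fail if not found.
    tool_path=$(command -v "$1") || return 1
    # readlink -f follows every symlink in the path and prints the final target.
    readlink -f "$tool_path"
}
```

Calling resolve_tool mpif90 should then print a path inside whichever Open MPI installation the wrapper really belongs to, which shows at a glance whether the 3.1.5 or the 4.0.5 build is being picked up.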

PS… Here is LD_LIBRARY_PATH:

echo $LD_LIBRARY_PATH
/home/michalak/p4est-2.0.374-pgi/lib:/home/michalak/hdf5-1.10.5-pgi/lib:/home/michalak/p4est-2.0.374-pgi/lib:/home/michalak/hdf5-1.10.5-pgi/lib:/opt/nvidia/hpc_sdk/Linux_x86_64/2021/cuda/11.2/extras/CUPTI/lib64:/opt/nvidia/hpc_sdk/Linux_x86_64/2021/cuda/lib64:/opt/nvidia/hpc_sdk/Linux_x86_64/2021/compilers/lib:/opt/nvidia/hpc_sdk/Linux_x86_64/2021/comm_libs/mpi/lib:/opt/nvidia/hpc_sdk/Linux_x86_64/2021/lib64:/home/michalak/netcdf-4.7.4-intel/lib:/home/michalak/hdf5-1.8.14-ifort13/hdf5/lib:/home/michalak/p4est-2.0.374-intel/lib:/home/michalak/gcc/install/lib64:/home/michalak/pin-2.14-67254-gcc.4.4.7-linux/source/tools/ManualExamples/obj-intel64

I tried 21.1 but got the same results: only the alignment warning when compiling with OpenMPI 3.1.5. I’ve asked Chris to take a look for ideas.

Though, can you try the OpenMPI 4 install to see if it works around the issue? (/opt/nvidia/hpc_sdk/Linux_x86_64/21.1/comm_libs/openmpi4/openmpi-4.0.5/)

Hi Mat,

I switched my environment to use the executables and libraries from OpenMPI 4 that is part of the current hpc_sdk installation on my system. With this change, the ./a.out produced by compiling the reproducer:

mpif90 -DBROKEN test.F90

runs correctly, without the run-time error in the call to MPI_Comm_set_errhandler.

As always, thanks,

John

PS. For other users: to modify my shell environment I set:

setenv PATH /opt/nvidia/hpc_sdk/Linux_x86_64/2021/comm_libs/openmpi4/openmpi-4.0.5/bin:$PATH
setenv LD_LIBRARY_PATH /opt/nvidia/hpc_sdk/Linux_x86_64/2021/comm_libs/openmpi4/openmpi-4.0.5/lib:$LD_LIBRARY_PATH

in my .cshrc file. There may be a more elegant way to do this.
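
For bash or other POSIX-style shells, the same change might be sketched as below. The function name prepend_mpi_prefix is hypothetical; the prefix is the OpenMPI 4 path from the setenv lines above and may differ on other systems.

```shell
# Hypothetical helper: put an MPI install's bin/ and lib/ first in the search paths.
prepend_mpi_prefix() {
    prefix=$1
    PATH="$prefix/bin:$PATH"
    # Only append the old value when it is non-empty, so no empty entry
    # (which the dynamic loader treats as the current directory) is created.
    LD_LIBRARY_PATH="$prefix/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export PATH LD_LIBRARY_PATH
}

# Prefix taken from the setenv lines above; adjust to your installation.
prepend_mpi_prefix /opt/nvidia/hpc_sdk/Linux_x86_64/2021/comm_libs/openmpi4/openmpi-4.0.5
```

With these lines in a startup file such as .bashrc, which mpif90 should report the wrapper under the openmpi-4.0.5 directory, and the reproducer needs to be recompiled so that it links against the matching libraries.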