At the linking stage of our Fortran application, when using the use mpi_f08 Fortran bindings, I see the following messages with HPC SDK 21.3 (Power9).
Is this something I should be concerned about, or can I safely ignore these warnings?
Cheers, Thomas
/usr/bin/ld: Warning: alignment 4 of symbol `ompi_f08_mpi_double_precision' in /m100/prod/opt/compilers/hpc-sdk/2021/binary/Linux_ppc64le/21.3/comm_libs/openmpi/openmpi-3.1.5/lib/libmpi_usempif08.so is smaller than 8 in obj/opt_acc/control_all.o
/usr/bin/ld: Warning: alignment 4 of symbol `ompi_f08_mpi_integer' in /m100/prod/opt/compilers/hpc-sdk/2021/binary/Linux_ppc64le/21.3/comm_libs/openmpi/openmpi-3.1.5/lib/libmpi_usempif08.so is smaller than 8 in obj/opt_acc/control_all.o
/usr/bin/ld: Warning: alignment 4 of symbol `ompi_f08_mpi_ub' in /m100/prod/opt/compilers/hpc-sdk/2021/binary/Linux_ppc64le/21.3/comm_libs/openmpi/openmpi-3.1.5/lib/libmpi_usempif08.so is smaller than 8 in obj/opt_acc/pputil.o
/usr/bin/ld: Warning: alignment 4 of symbol `ompi_f08_mpi_min' in /m100/prod/opt/compilers/hpc-sdk/2021/binary/Linux_ppc64le/21.3/comm_libs/openmpi/openmpi-3.1.5/lib/libmpi_usempif08.so is smaller than 8 in obj/opt_acc/pputil.o
/usr/bin/ld: Warning: alignment 4 of symbol `ompi_f08_mpi_real8' in /m100/prod/opt/compilers/hpc-sdk/2021/binary/Linux_ppc64le/21.3/comm_libs/openmpi/openmpi-3.1.5/lib/libmpi_usempif08.so is smaller than 8 in obj/opt_acc/control_all.o
/usr/bin/ld: Warning: alignment 4 of symbol `ompi_f08_mpi_comm_null' in /m100/prod/opt/compilers/hpc-sdk/2021/binary/Linux_ppc64le/21.3/comm_libs/openmpi/openmpi-3.1.5/lib/libmpi_usempif08.so is smaller than 8 in obj/opt_acc/solver/interfaceSolverInOrb5.o
/usr/bin/ld: Warning: alignment 4 of symbol `ompi_f08_mpi_integer8' in /m100/prod/opt/compilers/hpc-sdk/2021/binary/Linux_ppc64le/21.3/comm_libs/openmpi/openmpi-3.1.5/lib/libmpi_usempif08.so is smaller than 8 in obj/opt_acc/parmove.o
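For context: as far as I can tell, the ompi_f08_* names in these warnings are the predefined handle objects (MPI_INTEGER, MPI_DOUBLE_PRECISION, MPI_MIN, ...) that Open MPI's mpi_f08 module exports from libmpi_usempif08.so, and the linker is only noting that the library records a 4-byte alignment for them while my object files assume 8. A minimal sketch of the kind of code that references these symbols (not the actual application):

! Minimal sketch (not the actual application): with use mpi_f08 the
! predefined handles such as MPI_DOUBLE_PRECISION and MPI_MIN are module
! variables bound to the ompi_f08_* symbols named in the warnings, so any
! code like this pulls them in at link time.
program f08_handles_sketch
  use mpi_f08
  implicit none
  integer :: rank
  double precision :: x, xmin

  call MPI_Init()
  call MPI_Comm_rank(MPI_COMM_WORLD, rank)

  x = dble(rank)
  ! References ompi_f08_mpi_double_precision and ompi_f08_mpi_min.
  call MPI_Allreduce(x, xmin, 1, MPI_DOUBLE_PRECISION, MPI_MIN, MPI_COMM_WORLD)

  call MPI_Finalize()
end program f08_handles_sketch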
I also encountered this issue. After following your suggestion to switch from Open MPI 3 to Open MPI 4 by updating the PATH and LD_LIBRARY_PATH, I noticed a significant performance drop. Open MPI 4 is much slower than Open MPI 3, even though I am using the same code and compiler flags.
Here are the details of my environment:
CUDA 11.8
NVHPC 22.11
Compiler Command: mpif90 -O3 -cpp -tp=zen3 -gpu=ptxinfo -g -Wall -Minfo
I used Nsight Systems to profile the two cases and found an interesting result: the kernel times are almost identical, but the data transfers (handled inside Open MPI) take more than 5x longer with Open MPI 4 than with Open MPI 3.
Additionally, I noticed that with Open MPI 4 the transferred data shows up as pageable, while with Open MPI 3 it is pinned. I understand that data is normally pinned during the send/receive path, but I am not sure what changed in Open MPI 4 to cause this behavior.
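For reference, the exchanges I am profiling look roughly like the following (a simplified, hypothetical sketch with made-up buffer names, assuming OpenACC-managed device buffers passed to a CUDA-aware Open MPI); as far as I can tell, whether the staging copies appear in Nsight Systems as pinned or pageable is decided inside the MPI library rather than in this code:

! Simplified, hypothetical sketch of the kind of exchange being profiled
! (assumes a CUDA-aware Open MPI build and OpenACC device buffers).
! Run with 2 ranks.
program gpu_exchange_sketch
  use mpi_f08
  implicit none
  integer, parameter :: n = 1024*1024
  real(8), allocatable :: sendbuf(:), recvbuf(:)
  integer :: rank, peer
  type(MPI_Status) :: status

  call MPI_Init()
  call MPI_Comm_rank(MPI_COMM_WORLD, rank)
  peer = merge(1, 0, rank == 0)   ! ranks 0 and 1 exchange with each other

  allocate(sendbuf(n), recvbuf(n))
  sendbuf = dble(rank)

  !$acc data copyin(sendbuf) copyout(recvbuf)
  !$acc host_data use_device(sendbuf, recvbuf)
  ! Device addresses are handed straight to MPI; how the library stages
  ! them (pinned host buffers, pageable copies, or GPUDirect) is an
  ! Open MPI build/runtime decision, which is what shows up differently
  ! in the nsys timelines.
  call MPI_Sendrecv(sendbuf, n, MPI_REAL8, peer, 0, &
                    recvbuf, n, MPI_REAL8, peer, 0, &
                    MPI_COMM_WORLD, status)
  !$acc end host_data
  !$acc end data

  call MPI_Finalize()
end program gpu_exchange_sketch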
Do you have insights into why this might be happening or how to resolve it?