How to build OpenMPI with nvhpc/24.1

Hi,

I am having some trouble compiling and running my code with nvhpc/24.1 and OpenMPI.

  1. With the provided OpenMPI version nvhpc-openmpi3/24.1, my code compiles but is not compatible with the Slurm setup of the cluster (Slurm is 20.11.7-1):
The application appears to have been direct launched using "srun",
but OMPI was not built with SLURM's PMI support and therefore cannot
execute. There are several options for building PMI support under
SLURM, depending upon the SLURM version you are using:

I must launch with srun in the batch file, as using mpirun does not set some Slurm variables required for accessing the GPUs.

  2. Since we use OpenMPI 4.1.4 with the GNU compiler, I've built this version using nvhpc-nompi/24.1 and the same setup used for GNU:
export cuda=/opt/nvidia/hpc_sdk/Linux_x86_64/24.1/cuda
../configure --with-hwloc --enable-mpirun-prefix-by-default \
  --prefix=$dest --with-pmi --enable-mpi1-compatibility \
  --with-ucx=$dest --enable-mpi-cxx --with-slurm \
  --enable-pmix-timing --with-pmix --without-verbs \
  --with-cuda=$cuda

These options do not seem very different from the ones returned by ompi_info for nvhpc-openmpi3/24.1.

But we are using wrappers in front of the MPI calls, as shown in the small test case, and with nvhpc-nompi/24.1 + OpenMPI 4.1.4 I cannot compile successfully:
bash-4.4$ mpifort --show
nvfortran -I/opt/nvidia/openmpi-legi/4.1.4/include -I/opt/nvidia/openmpi-legi/4.1.4/lib -L/opt/nvidia/openmpi-legi/4.1.4/lib -rpath /opt/nvidia/openmpi-legi/4.1.4/lib -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi
bash-4.4$ mpifort -c comm.f90
NVFORTRAN-S-0155-Could not resolve generic procedure mpi_bcast (comm.f90: 22)
0 inform, 0 warnings, 1 severes, 0 fatal for my_bcast_character
NVFORTRAN-S-0155-Could not resolve generic procedure mpi_bcast (comm.f90: 44)
0 inform, 0 warnings, 1 severes, 0 fatal for my_bcast_logical_scalar

This code compiles successfully with:

  • nvhpc-openmpi3/24.1
  • OpenMPI 4.1.4 and GNU OG13 branch
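
For reference, the wrappers are along these lines (a simplified, hypothetical reconstruction; the actual code is in the attached comm.f90):

module comm
  use mpi                ! the generic mpi_bcast interface comes from the "use mpi" module
  implicit none
contains
  ! Broadcast a character string; with the OpenMPI 4.1.4 build described above,
  ! nvfortran reports NVFORTRAN-S-0155 ("Could not resolve generic procedure mpi_bcast") here.
  subroutine my_bcast_character(buf, root)
    character(len=*), intent(inout) :: buf
    integer, intent(in) :: root
    integer :: ierr
    call mpi_bcast(buf, len(buf), MPI_CHARACTER, root, MPI_COMM_WORLD, ierr)
  end subroutine my_bcast_character

  ! Broadcast a scalar logical; same error is reported for this wrapper.
  subroutine my_bcast_logical_scalar(flag, root)
    logical, intent(inout) :: flag
    integer, intent(in) :: root
    integer :: ierr
    call mpi_bcast(flag, 1, MPI_LOGICAL, root, MPI_COMM_WORLD, ierr)
  end subroutine my_bcast_logical_scalar
end module comm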

Maybe there are special things to do when building OpenMPI with the NVIDIA HPC SDK?

comm.f90.txt (1.9 KB)

Thanks for your advice.

Patrick


Hi Patrick,

This is a known issue with recent versions of Open MPI and the NVIDIA HPC compilers.

As a workaround, try adding -Mstandard to the FCFLAGS variable when you invoke the ./configure script of Open MPI.

Hope this helps.

+chris


Thanks Chris, this option solves the problem.
This is now my basic compiler setup to build OpenMPI 4.1.4 with nvhpc-nompi/24.1:

CC=nvc++ CXX=nvc++ FC=nvfortran \
CFLAGS=-fPIC CXXFLAGS=-fPIC FCFLAGS="-Mstandard -fPIC" \
  ../configure .....

and it works!
(Now I have to check whether GPU-to-GPU communication works too, but it should, as it did when using include "mpif.h" as a workaround with the previous OpenMPI build.)

Patrick

Hi,
I have an additional question about this setup of OpenMPI with NVIDIA GPUs. I'm running some tests on a node with 2 PCIe 4.0 A100 GPUs, using the osu-micro-benchmarks-3.8 tests (I was unable to compile the latest version with nvhpc).
I slightly modified osu_bw.c to take the Slurm resources into account (to be sure the 2 processes are each offloaded to a distinct GPU); a sketch of the idea follows below. The maximum bandwidth reached is 16 GB/s for osu_bw, which is about half the PCIe 4.0 x16 bandwidth.
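
Roughly, the change amounts to something like this (shown here as a CUDA Fortran sketch rather than the actual C edit to osu_bw.c; the reliance on SLURM_LOCALID is illustrative):

program bind_gpu_by_local_rank
  use mpi
  use cudafor
  implicit none
  integer :: ierr, istat, rank, local_id, ndev
  character(len=16) :: env

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)

  ! srun sets SLURM_LOCALID to the node-local rank of each task.
  call get_environment_variable("SLURM_LOCALID", env)
  read(env, *) local_id

  ! Map each local rank to its own GPU so the two processes use distinct A100s.
  istat = cudaGetDeviceCount(ndev)
  istat = cudaSetDevice(mod(local_id, ndev))

  ! ... benchmark / communication code would go here ...

  call MPI_Finalize(ierr)
end program bind_gpu_by_local_rank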

I've read about GPUDirect (Benchmark Tests - NVIDIA Docs), but it seems to be related to "GPU-Node-Node-GPU" communications. Should I use it for intranode communications too? And modify my OpenMPI setup?

# OSU MPI-CUDA Bandwidth Test v3.8
# Send Buffer on DEVICE (D) and Receive Buffer on DEVICE (D)
# Size      Bandwidth (MB/s)
1                       0.09
2                       0.18
4                       0.37
8                       0.73
16                      1.52
32                      2.80
64                      5.49
128                    10.75
256                    21.71
512                    43.51
1024                   84.30
2048                  172.98
4096                  330.46
8192                  578.57
16384                3013.24
32768                5584.44
65536                8690.47
131072              11354.22
262144              13467.84
524288              14828.03
1048576             15621.00
2097152             16045.29
4194304             16266.04

Yes, CUDA-aware MPI, which uses GPUDirect communication, can be used between GPUs on the same node.

And modify my OpenMPI setup?

CUDA-aware MPI works by passing device pointers to the MPI calls, so it's more of a program issue, assuming your OpenMPI was built with CUDA-aware MPI enabled (which it looks like yours was).
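
For example, "passing device pointers" looks roughly like this in CUDA Fortran (a minimal illustration, not taken from the OSU sources):

program cuda_aware_pingpong
  use mpi
  use cudafor
  implicit none
  integer, parameter :: n = 1048576
  real(8), device, allocatable :: d_buf(:)   ! buffer lives in GPU memory
  integer :: rank, ierr
  integer :: status(MPI_STATUS_SIZE)

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)

  allocate(d_buf(n))
  d_buf = real(rank, 8)

  ! With a CUDA-aware Open MPI, the device array is handed directly to MPI;
  ! no explicit copy to a host staging buffer is needed.
  if (rank == 0) then
     call MPI_Send(d_buf, n, MPI_DOUBLE_PRECISION, 1, 0, MPI_COMM_WORLD, ierr)
  else if (rank == 1) then
     call MPI_Recv(d_buf, n, MPI_DOUBLE_PRECISION, 0, 0, MPI_COMM_WORLD, status, ierr)
  end if

  deallocate(d_buf)
  call MPI_Finalize(ierr)
end program cuda_aware_pingpong

It would be built with something like mpifort -cuda; the only essential point is that d_buf is device memory.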

I haven't used the OSU benchmarks for quite a while myself, but I assume they've been updated to use CUDA-aware MPI, in which case you shouldn't need to do anything extra.
