I recently decided to try out NVHPC (21.3, as it’s the latest I have access to) after a while out of the PGI/NVIDIA game (hi @MatColgrove!). I’d been away because our model, GEOS, uses some F2008 bits and it’s possible/probable that nvfortran doesn’t support them.
But if nothing else, I can get our code ready for it if/when it does. So my first task was to build our base libraries. I’d gotten things like jpeg, zlib, szlib, and hdf4 built, but then it fell apart with HDF5. It fails at the
configure: error: unable to link a simple MPI-IO C program
and when I look at config.log I see what is a new-to-me error:
configure:29537: checking whether a simple MPI-IO C program can be linked
configure:29559: mpicc -o conftest -fPIC -D_GNU_SOURCE -D_POSIX_C_SOURCE=200809L -I/discover/swdev/gmao_SIteam/Baselibs/ESMA-Baselibs-main-NVHPCtest/x86_64-pc-linux-gnu/nvhpc_21.3-openmpi_4.0.5/Linux/include/zlib -I/discover/swdev/gmao_SIteam/Baselibs/ESMA-Baselibs-main-NVHPCtest/x86_64-pc-linux-gnu/nvhpc_21.3-openmpi_4.0.5/Linux/include/szlib -lm -L/discover/swdev/
nu/nvhpc_21.3-openmpi_4.0.5/Linux/lib conftest.c -lsz -lz -ldl -lm >&5
ux/lib/libz.so.1: no version information available (required by /gpfsm/dulocal/sles12/nvhpc/Linux_x86_64/21.3/compilers/share/llvm/bin/llc)
/usr/bin/ld: /discover/swdev/gmao_SIteam/Baselibs/ESMA-Baselibs-main-NVHPCtest/x86_64-pc-linux-gnu/nvhpc_21.3-openmpi_4.0.5/Linux/lib/libz.so.1: no version information available (required by
/usr/bin/ld: /discover/swdev/gmao_SIteam/MPI/openmpi/4.0.6/nvhpc-21.3/lib/libopen-rte.so.40: undefined reference to `deflateBound@ZLIB_1.2.0'
Looking around the internet, I do see posts saying this error means your zlib was compiled with a different stack, but nvc built zlib just before this and seemed happy to do so.
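The `deflateBound@ZLIB_1.2.0` in the last linker line is a reference to a versioned symbol, so one thing worth checking is whether the freshly built libz.so.1 actually exports `deflateBound` with that version tag. The following is only a diagnostic sketch, assuming GNU binutils (`objdump`) is available; `has_symbol_version` and `LIBZ_PATH` are hypothetical names used for illustration, not part of the build above.

```shell
#!/bin/sh
# has_symbol_version SYMBOL VERSION
# Reads `objdump -T` output on stdin and succeeds only if SYMBOL is listed
# with VERSION in the version column (plain or in parentheses).
has_symbol_version() {
    awk -v sym="$1" -v ver="$2" '
        $NF == sym && ($(NF-1) == ver || $(NF-1) == "(" ver ")") { found = 1 }
        END { exit (found ? 0 : 1) }'
}

# LIBZ_PATH is an assumption: set it to the libz.so.1 named in the ld message.
if [ -n "${LIBZ_PATH:-}" ]; then
    if objdump -T "$LIBZ_PATH" | has_symbol_version deflateBound ZLIB_1.2.0; then
        echo "deflateBound is tagged ZLIB_1.2.0"
    else
        echo "deflateBound is missing or not tagged ZLIB_1.2.0"
    fi
fi
```

Running it once against the libz.so.1 in the Baselibs install and once against the system libz should show whether the two differ in symbol versioning.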
Now, I have tried this a couple of ways. The first time, I used the “built-in” Open MPI 4.0.5 that came with NVHPC 21.3 and it died in much the same way. I stared at that last line about libopen-rte and decided to build my own Open MPI stack in case, I dunno, our cluster was “too old” and this was a “NVHPC Open MPI was built on too new a machine” sort of thing. But, no luck. Same failure. (Note I built my Open MPI as close as I could to what the built-in ompi_info said NVIDIA built theirs with.)
So, do any of the NVHPC gurus here know this error? Perhaps there’s a flag I need to pass to zlib?
Good to hear from you. Let me ask Chris to answer since he does all our third party library builds so should be better able to help.
What options are you passing to the configure script for HDF5?
This is what we use to build HDF5 1.10.5 here:
../configure --prefix=/path/to/install/dir --enable-shared --enable-static --enable-fortran --enable-hl --enable-parallel --with-zlib --with-szlib=/path/to/szlib/dir CC=mpicc FC=mpifort F77=mpifort CPP=cpp CFLAGS="-fPIC -O1 -tp p7-64 -nomp" FCFLAGS="-fPIC -O1 -tp p7-64 -nomp" FFLAGS="-fPIC -O1 -tp p7-64 -nomp"
I have not tried anything more recent, although I note that 1.10.7 seems to be the latest version available now - so hopefully, not much different than 1.10.5.
We tune the executables for base x86_64, since we are required to support every x86_64 type CPU out there. If you do not have this requirement, you can drop the -tp p7-64 flag.
My configure screen line is:
--disable-shared --disable-cxx --enable-hl --enable-fortran
--disable-sharedlib-rpath --enable-gpfs --enable-parallel
FCFLAGS= CC=mpicc FC=mpifort CXX=mpic++ F77=mpifort
It looks a bit wonky because some flags are filled in differently for different compilers through a make system. You’ll note I point it to the zlib and szlib I build in the same set of libraries.
Let me try your flags for C and Fortran. Maybe it helps? I want to say I tried at least adding -nomp but it didn’t do anything.
zlib, it was built with:
so nothing fancy.
You can simply add -L/lib/x86_64-linux-gnu/ -lz as the first option (order matters) to your CFLAGS and FCFLAGS to resolve the issue.
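As a rough sketch of what that suggestion looks like in practice, the snippet below puts the system zlib flags ahead of everything else in both variables and prints the resulting configure line instead of running it. The directory is the one given in the reply; it is the Debian/Ubuntu multiarch location, so on other distributions the system libz may live elsewhere and the path is an assumption to verify. `SYS_ZLIB_FLAGS` and `BASE_FLAGS` are hypothetical variable names, and the remaining flags are copied from the configure lines earlier in the thread.

```shell
#!/bin/sh
# Assumption: directory holding the system libz, as given in the reply above.
SYS_ZLIB_FLAGS="-L/lib/x86_64-linux-gnu/ -lz"

# Baseline compiler flags from the NVIDIA HDF5 recipe, minus -tp p7-64.
BASE_FLAGS="-fPIC -O1 -nomp"

# Order matters: the system zlib flags go first.
CFLAGS="$SYS_ZLIB_FLAGS $BASE_FLAGS"
FCFLAGS="$SYS_ZLIB_FLAGS $BASE_FLAGS"

# Print the command for inspection rather than executing it.
echo ../configure --enable-parallel --enable-fortran \
    CC=mpicc FC=mpifort "CFLAGS=\"$CFLAGS\"" "FCFLAGS=\"$FCFLAGS\""
```

Any other configure options you already use would be appended to the printed line unchanged.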