Nvc/nvc++, shared libs, library runpath and version compatibility

I’m using the nvhpc compilers to generate a shared object on Ubuntu 20. I’ve set up my environment so that it points to the more generic 2023 folder, i.e. PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/2023/comm_libs/mpi/bin:/opt/nvidia/hpc_sdk/Linux_x86_64/2023/compilers/bin:[...] rather than using 23.1 (the version used here).

However, I noticed that the runpath of a shared object created this way with nvc via CMake contains the exact version:

Library runpath: [/opt/nvidia/hpc_sdk/Linux_x86_64/23.1/compilers/lib:/opt/nvidia/hpc_sdk/Linux_x86_64/23.1/cuda/12.0/lib64:/usr/lib/gcc/x86_64-linux-gnu/9/../../../../lib64]
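For reference, that runpath can be read straight from the library’s dynamic section; a minimal sketch, with the library name as a placeholder:

```
# print the dynamic section and filter for the runpath entry
readelf -d libmylib.so | grep -E 'RPATH|RUNPATH'

# objdump shows the same information
objdump -p libmylib.so | grep -E 'RPATH|RUNPATH'
```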

I was hoping to be able to upgrade the nvhpc version down the road without having to recompile or set LD_LIBRARY_PATH, which is not going to work with this runpath.

What sort of binary compatibility can I expect between nvhpc releases? Would this work, provided the runpath would be set differently? Is this set by CMake, or is it an nvc default?

nvc does set the rpath, but I’m not sure about CMake. With nvc, although you are invoking it via a symlink, it will use paths relative to its actual location, hence the versioned rpaths.
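If you do want the generic 2023 symlink baked in instead, one option is to pass an explicit rpath to the linker yourself, or to rewrite it after the fact. A minimal sketch, assuming -Wl,-rpath is passed through to ld as usual and using placeholder file names; note that nvc may still append its own versioned entries on top of this:

```
# link the shared object with the unversioned 2023 path as an extra runpath entry
nvc -acc -fPIC -shared foo.c -o libfoo.so \
    -Wl,-rpath,/opt/nvidia/hpc_sdk/Linux_x86_64/2023/compilers/lib

# or rewrite the runpath of an already-built library
patchelf --set-rpath /opt/nvidia/hpc_sdk/Linux_x86_64/2023/compilers/lib libfoo.so

# verify what ended up in the dynamic section
readelf -d libfoo.so | grep RUNPATH
```

With CMake, variables such as CMAKE_BUILD_RPATH and CMAKE_INSTALL_RPATH control the entries CMake itself adds, but they won’t remove whatever the compiler driver injects.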

Would this work, provided the runpath would be set differently?

I don’t quite understand the use case here. Are you wanting to build a shared library with one version of the NVHPC compilers, and then use this library with a binary built with a later version of the compiler?

In this case, I believe the runtime libraries used would be those linked against the binary, not what is set in the rpath of the SO. I looked at the output from strace, which seems to confirm this.
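A couple of ways to double-check which libraries the loader actually resolves for the final binary (the binary name is a placeholder):

```
# list every shared object the dynamic loader resolves for the binary
ldd ./myapp

# watch the loader open them at run time, as was done with strace above
strace -e trace=openat ./myapp 2>&1 | grep '\.so'

# or ask the loader itself to report its search decisions
LD_DEBUG=libs ./myapp
```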

What sort of binary compatibility can I expect between nvhpc releases?

My expectation is that using an SO built with a slightly older compiler together with a newer version of the compiler will be OK in most cases. However, we can’t guarantee that it will work in all cases. We do add new features, so if you took advantage of these in the new binary, it could possibly cause issues with objects built with older versions of the compiler.

Or even create the binary with gcc, but yes, from code with OpenACC pragmas. Think of a CI platform with a Docker image that contains the nvhpc toolkit. It creates packages for Artifactory, to be used by other software components. I was thinking of updating the Docker image that runs the CI build for the OpenACC code without changing the platform itself that groups the packages, so you could end up with older libs used by newer binaries, and the other way around, I suppose.
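Just to make the mixed-toolchain case concrete, a rough sketch of what I mean (file and library names are placeholders, and the exact flags the gcc side needs will depend on where the NVHPC runtime libraries live):

```
# library side: built in the CI image with the NVHPC toolchain and OpenACC
nvc -acc -fPIC -shared kernels.c -o libkernels.so

# application side: linked with gcc against that library; the NVHPC runtime
# libraries still have to be findable, here via -rpath-link at link time and
# the library's own runpath at run time
gcc main.c -o app -L. -lkernels \
    -Wl,-rpath-link,/opt/nvidia/hpc_sdk/Linux_x86_64/2023/compilers/lib \
    -Wl,-rpath,'$ORIGIN'
```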

I hadn’t even considered that situation… Good question…

So ideally everything that’s linked together, statically or dynamically, uses the same version of the HPC SDK toolkit. Otherwise it may or may not work, and there are no guarantees? We can work with that; it just means that updating the compiler is more effort and will happen less frequently.

Another question… Are there runtime libraries as a separate package available just for using OpenACC enabled binaries / libs? Or do we need to install the whole SDK for that?

Another question… Are there runtime libraries as a separate package available just for using OpenACC enabled binaries / libs? Or do we need to install the whole SDK for that?

We do provide containers that include only the redistributable runtime libraries, which may be incorporated into your Docker images. See: NVIDIA HPC SDK | NVIDIA NGC

The full list is provided under the “tags” tab, but the naming convention to get the runtime image from a container name is to replace “devel” with “runtime” and specify the CUDA version. For example:

The image “nvcr.io/nvidia/nvhpc:23.1-devel-cuda_multi-rockylinux8” would be replaced with “nvcr.io/nvidia/nvhpc:23.1-runtime-cuda11.8-rockylinux8” for the 23.1 runtime libraries targeting CUDA 11.8.
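So, following that convention, pulling the pair for a given release would look roughly like this:

```
# devel image used to build, runtime image used to deploy
docker pull nvcr.io/nvidia/nvhpc:23.1-devel-cuda_multi-rockylinux8
docker pull nvcr.io/nvidia/nvhpc:23.1-runtime-cuda11.8-rockylinux8
```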

I was thinking more along the lines of a Conan package or similar for an automated client installation. For an HPC/cloud use case Docker works fine, but less so for a developer PC and a local client… Usually we use Conan to install dependencies, especially cross-platform ones. That part does not apply here and apt works, I guess; it’s just that the whole nvhpc deb package is monolithic, pretty huge, and includes a lot of things that a developer simply using an OpenACC-accelerated lib in his software does not need.