libnvJitLink.so.12 not being linked automatically

Hi,

I am compiling POT3D (predsci/POT3D on GitHub, a high-performance potential field solver) for the GPU, including the cusparse option.

This worked in the past (previous versions of the compiler), but now, while the code compiles, it cannot be run due to a missing link:

libnvJitLink.so.12 => not found

I am using -cudalib=cusparse for my nvc and nvfortran compilation.

I am activating the compile environment with:

version=24.3
NVARCH=`uname -s`_`uname -m`; export NVARCH
NVCOMPILERS=/opt/nvidia/hpc_sdk; export NVCOMPILERS
MANPATH=$MANPATH:$NVCOMPILERS/$NVARCH/$version/compilers/man; export MANPATH
PATH=$NVCOMPILERS/$NVARCH/$version/compilers/bin:$PATH; export PATH
export PATH=$NVCOMPILERS/$NVARCH/$version/comm_libs/openmpi/openmpi-3.1.5/bin:$PATH
export MANPATH=$MANPATH:$NVCOMPILERS/$NVARCH/$version/comm_libs/openmpi/openmpi-3.1.5/man

For all other libraries (including libcudart.so.12, which is in the same directory as libnvJitLink.so.12), I do not need to set an explicit LD_LIBRARY_PATH (presumably because the compiler is using some sort of rpath?).

It would be useful to also have this library be linked in the same manner to avoid having to set an LD_LIBRARY_PATH in the compiler environment setup.

Thanks

– Ron

Hi Ron,

libnvJitLink.so is new in CUDA 12, which is likely why this worked before. For better or worse, though, I’m not able to recreate the issue here. We don’t link this library directly; it’s a dependent library of libcusparse.so. We do, however, set an rpath to the CUDA runtime libraries at link time, so the loader should be able to find it.

Can you run “ldd” on libcusparse.so and your executable so we can see how the loader is resolving the paths?

For example, here’s what it looks like on my system. Note that I do not have LD_LIBRARY_PATH set.

% ldd /proj/nv/Linux_x86_64/24.3/math_libs/12.4/lib64/libcusparse.so.12
        linux-vdso.so.1 (0x00007ffc5f9ab000)
        libnvJitLink.so.12 => /proj/nv/Linux_x86_64/24.3/math_libs/12.4/lib64/../../../../../cuda/12.4/lib64/libnvJitLink.so.12 (0x0000151fabcc7000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x0000151fabaa8000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x0000151fab8a0000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x0000151fab69c000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x0000151fab2fe000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x0000151fab0e6000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x0000151faacf5000)
        /lib64/ld-linux-x86-64.so.2 (0x0000151fc00a5000)
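
If the ldd output is long, the unresolved entries can be pulled out with a small filter. This is only a sketch: `unresolved_libs` is an illustrative helper name, and the library path in the example comment is a placeholder for the one in your own install.

```shell
# List the library names that ldd could not resolve.
# Reads ldd-style output on stdin; unresolved_libs is an illustrative name.
unresolved_libs() {
    awk '/=> not found/ { print $1 }'
}

# Example (substitute the libcusparse path from your own install):
# ldd /path/to/libcusparse.so.12 | unresolved_libs
```

An empty result means the loader resolved every dependency it was asked about.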

Also, what CUDA version is being used? (if you don’t know, check the CUDA driver via nvidia-smi)
Besides -cudalib=cusparse, what other flags do you use on the link?

Thanks,
Mat

Hi,

I get:

PREDSCI-GPU2-NV2403: ~/Desktop $ ldd /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/math_libs/12.3/lib64/libcusparse.so.12
	linux-vdso.so.1 (0x00007fff33782000)
	libnvJitLink.so.12 => not found
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007150d16eb000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007150d16e6000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007150d16df000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007150c14ee000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007150c14ce000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007150c1200000)
	/lib64/ld-linux-x86-64.so.2 (0x00007150d16f2000)

and

PREDSCI-GPU2-NV2403: ~/Desktop $ nvidia-smi
Tue May  7 15:49:25 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.29.06              Driver Version: 545.29.06    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3090 Ti     On  | 00000000:01:00.0 Off |                    0 |
|  0%   35C    P8              24W / 300W |      3MiB / 23028MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

For flags, I am using:

-O3 -march=native -stdpar=gpu -acc=gpu -gpu=cc86,cc89,nomanaged,nounified -Minfo=accel -cudalib=cusparse

I am also linking to a self-compiled hdf5 library.

-L/opt/psi/nv/ext_deps/deps/hdf5/lib -lhdf5_fortran -lhdf5hl_fortran -lhdf5 -lhdf5_hl

– Ron

Thanks Ron.

I did a local install of 24.3. While the loader was able to find libnvJitLink, it found it under a local CUDA install rather than the one that we ship with NVHPC.

Likely what’s going on is that since it isn’t on the link line, given it’s a dependent library, the loader doesn’t use the rpath for it.
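
One way to check this is to look at the path embedded in the executable's dynamic section. The sketch below assumes `readelf` from binutils is installed; `show_runpath` and `filter_runpath` are illustrative helper names, and `./pot3d` is a placeholder for the actual binary.

```shell
# Keep only the dynamic-section lines that carry an embedded search path.
filter_runpath() {
    grep -E '\((RPATH|RUNPATH)\)'
}

# Print the RPATH/RUNPATH entries of an ELF file.
# show_runpath is an illustrative helper, not part of the NVHPC tooling.
show_runpath() {
    readelf -d "$1" | filter_runpath
}

# Example (binary path is a placeholder):
# show_runpath ./pot3d
```

With a RUNPATH entry (as opposed to the older RPATH), the glibc loader applies the path only to the object's direct dependencies, which would be consistent with a dependent library like libnvJitLink not being found through it.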

I think the solution here is for us to implicitly add it to the link line so its rpath gets set. I’ve opened a report, TPR #35636.

Besides setting LD_LIBRARY_PATH, a workaround for you would be to add “-lnvJitLink” to your link.
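
As a rough sketch of both options: the helper below searches an SDK tree for the directory that holds libnvJitLink.so.12, so no path has to be hard-coded. `find_nvjitlink_dir` is an illustrative name, and the `NVCOMPILERS`, `NVARCH`, and `version` variables in the comments are assumed to be set as in the environment setup earlier in this thread.

```shell
# Print the directory containing libnvJitLink.so.12 below the given root.
# Returns non-zero if nothing is found. The function body runs in a
# subshell so the temporary variable does not leak into the caller.
find_nvjitlink_dir() (
    lib=$(find "$1" -name 'libnvJitLink.so.12*' 2>/dev/null | head -n 1)
    [ -n "$lib" ] && dirname "$lib"
)

# Option 1: name the library on the link line, next to the existing flag:
#   -cudalib=cusparse -lnvJitLink
#
# Option 2: extend the loader search path at run time:
#   dir=$(find_nvjitlink_dir "$NVCOMPILERS/$NVARCH/$version") &&
#       export LD_LIBRARY_PATH="$dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

Option 1 keeps the fix in the build, so the environment setup script does not need to change.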

OK thanks!

– Ron

Hi,

On a possibly related note, when I try to install POT3D on WSL on Windows 11, I get an error saying it cannot find “-lcuda”.

I have the NV HPC SDK installed, but I do NOT have the CUDA toolkit installed in either WSL or Windows directly. I do have the NVIDIA App installed on Win11 and the latest game-ready drivers.

I also add to my activation of the compiler:

export LD_LIBRARY_PATH=/usr/lib/wsl/lib:${LD_LIBRARY_PATH}

This allows GPU codes to run (I can successfully run HipFT, the high-performance flux transport code at predsci/HipFT on GitHub).

I found that if I install libcudart11 from apt, the code compiles, but then I assume it’s using CUDA 11, which I don’t want.

Instead, I found that if I add -L/usr/lib/wsl/lib to my Makefile, the code compiles.

Is the missing -lcuda another case, like the one above, of a library missed by the rpath when compiling with NVHPC without the CUDA toolkit installed?

– Ron

This is because the CUDA driver (libcuda.so) gets installed in a non-default location on WSL. Hence, the linker and loader can’t find it (at least by default).

That said, you shouldn’t need to link with -lcuda unless you’re calling the CUDA driver API directly. Typically, libcuda.so gets dynamically loaded (via dlopen) at runtime. Because of this, I don’t think (though I’m not sure) that adding an rpath to it would help. You’ll likely need to add the path to LD_LIBRARY_PATH. Adding it to your shell config file (like .bashrc) will save you from having to set it each time.
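
For the shell config, something along these lines should work. It is a sketch rather than anything the SDK provides: `prepend_ld_path` is an illustrative helper name, and the guard simply avoids adding a missing or duplicate directory.

```shell
# Prepend a directory to LD_LIBRARY_PATH, but only if the directory
# exists and is not already listed. prepend_ld_path is an illustrative name.
prepend_ld_path() {
    [ -d "$1" ] || return 0
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$1:"*) ;;  # already present, nothing to do
        *) LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export LD_LIBRARY_PATH
}

# WSL driver library location from this thread; a no-op where it is absent.
prepend_ld_path /usr/lib/wsl/lib
```

Because the call is a no-op when the directory is missing, the same .bashrc can be shared between WSL and native Linux machines.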

Hi Ron,
Our 24.5 release is now out. It includes a change for FS#35636, referenced above: libnvJitLink is now added to the link line when you use cusparse with -cudalib.

FYI, I use WSL on my laptop and have the same issue as you. I have this in my .bashrc:
export PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/24.3/compilers/bin:$PATH
export LD_LIBRARY_PATH=/usr/lib/wsl/lib

Thanks!