I'm running Ubuntu 20.04 and getting this warning from MPI operations:
libibverbs: Warning: couldn't load driver 'libmlx5-rdmav25.so': libmlx5-rdmav25.so: cannot open shared object file: No such file or directory
These libraries are installed on the nodes:
lrwxrwxrwx 1 root root 23 Jun 11 2023 /usr/lib/aarch64-linux-gnu/libibverbs/libmlx5-rdmav34.so -> ../libmlx5.so.1.24.47.0
lrwxrwxrwx 1 root root 15 Jun 11 2023 /usr/lib/aarch64-linux-gnu/libibverbs.so -> libibverbs.so.1
lrwxrwxrwx 1 root root 23 Jun 11 2023 /usr/lib/aarch64-linux-gnu/libibverbs.so.1 -> libibverbs.so.1.14.47.0
-rw-r--r-- 1 root root 125480 Jun 11 2023 /usr/lib/aarch64-linux-gnu/libibverbs.so.1.14.47.0
-rw-r--r-- 1 root root 516368 Jun 11 2023 /usr/lib/aarch64-linux-gnu/libmlx5.so.1.24.47.0
So it’s looking for libmlx5-rdmav25.so, but the only provider we have installed is libmlx5-rdmav34.so.
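As far as I understand, the rdmavNN suffix is the provider ABI version that the loaded libibverbs was built to look for, so a request for rdmav25 suggests the process may not be using the libibverbs.so.1.14.47.0 listed above. That is an assumption on my part, not something I have confirmed. A minimal Python sketch to pull the suffixes out of the warning and the listing and show the mismatch (the helper name `provider_abis` is made up for illustration):

```python
import re

# Matches provider driver file names such as "libmlx5-rdmav25.so".
PROVIDER_RE = re.compile(r"lib(?P<name>[A-Za-z0-9_]+)-rdmav(?P<abi>\d+)\.so")


def provider_abis(text):
    """Return sorted unique (provider name, ABI number) pairs found in text."""
    return sorted(
        {(m.group("name"), int(m.group("abi"))) for m in PROVIDER_RE.finditer(text)}
    )


# Warning text and installed path copied from the output above.
warning = (
    "libibverbs: Warning: couldn't load driver 'libmlx5-rdmav25.so': "
    "libmlx5-rdmav25.so: cannot open shared object file: No such file or directory"
)
installed = "/usr/lib/aarch64-linux-gnu/libibverbs/libmlx5-rdmav34.so"

wanted = provider_abis(warning)
present = provider_abis(installed)
# Providers that were requested but are not installed with that ABI number.
missing = [p for p in wanted if p not in present]

print("requested:", wanted)
print("installed:", present)
print("missing:", missing)
```

Feeding it the full `ls` output of the libibverbs provider directory instead of a single path would list every installed provider and its ABI number.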
These packages are installed:
ii ibverbs-providers:arm64 2307mlnx47-1.2401033 arm64 User space provider drivers for libibverbs
ii ibverbs-utils 2307mlnx47-1.2401033 arm64 Examples for the libibverbs library
ii libibverbs-dev:arm64 2307mlnx47-1.2401033 arm64 Development files for the libibverbs library
ii libibverbs1:arm64 2307mlnx47-1.2401033 arm64 Library for direct userspace use of RDMA (InfiniBand/iWARP)
ii libibverbs1-dbg:arm64 2307mlnx47-1.2401033 arm64 Debug symbols for the libibverbs library
I haven’t figured out where the libmlx5-rdmav25.so dependency comes from. Are newer versions of the packages above available, though?
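One thing I could still check is which libibverbs file a running MPI rank actually has mapped, in case the MPI build or the application resolves to a second copy somewhere else. A rough Python sketch, assuming Linux and read access to /proc/<pid>/maps for the rank; the function names are my own, and the /opt/example path in the sample is hypothetical:

```python
import os


def mapped_libs(maps_text, needle="libibverbs"):
    """Return sorted unique mapped file paths whose file name contains needle.

    maps_text is the content of /proc/<pid>/maps. Each line has the fields:
    address perms offset dev inode [pathname]
    """
    paths = set()
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) == 6:  # anonymous mappings have no pathname field
            path = fields[5].strip()
            # Check the file name only, so providers under a directory that
            # happens to be called libibverbs/ are not matched.
            if needle in os.path.basename(path):
                paths.add(path)
    return sorted(paths)


def mapped_libs_for_pid(pid, needle="libibverbs"):
    """Read /proc/<pid>/maps for a live process and filter it."""
    with open(f"/proc/{pid}/maps") as f:
        return mapped_libs(f.read(), needle)


# Hypothetical sample input, not taken from the affected nodes.
sample = (
    "ffff80000000-ffff80020000 r-xp 00000000 08:01 1234 "
    "/opt/example/lib/libibverbs.so.1\n"
    "ffff80020000-ffff80030000 rw-p 00000000 00:00 0\n"
    "ffff80040000-ffff80060000 r-xp 00000000 08:01 5678 "
    "/usr/lib/aarch64-linux-gnu/libibverbs/libmlx5-rdmav34.so\n"
)
print(mapped_libs(sample))
```

Calling `mapped_libs_for_pid(<pid of an MPI rank>)` on a node while the job is running would show whether the mapped library is the /usr/lib/aarch64-linux-gnu/libibverbs.so.1.14.47.0 listed above or a different copy.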