Error running HPL on multiple nodes

Command line:
srun -N 2 --ntasks-per-node=8 --cpu-bind=none --mpi=pmix --container-image="${CONT}" ./hpl.sh --dat /workspace/hpl-linux-x86_64/sample-dat/HPL-H200-16GPUs.dat

/dvs/p4/build/sw/rel/gpgpu/toolkit/r12.8/main_nvshmem/src/modules/transport/common/transport_ib_common.cpp:97: NULL value mem registration failed. Reason: Bad address

/dvs/p4/build/sw/rel/gpgpu/toolkit/r12.8/main_nvshmem/src/modules/transport/ibrc/ibrc.cpp:498: non-zero status: 2 Unable to register memory handle.
/dvs/p4/build/sw/rel/gpgpu/toolkit/r12.8/main_nvshmem/src/host/mem/mem_heap.cpp:931: non-zero status: 7 register_mem_handle failed for remote

/dvs/p4/build/sw/rel/gpgpu/toolkit/r12.8/main_nvshmem/src/host/mem/mem_heap.cpp:1099: non-zero status: 7 register heap memory failed

/dvs/p4/build/sw/rel/gpgpu/toolkit/r12.8/main_nvshmem/src/host/mem/mem_heap.cpp:1534: non-zero status: 7 register heap UC memory failed

/dvs/p4/build/sw/rel/gpgpu/toolkit/r12.8/main_nvshmem/src/host/mem/mem_heap.cpp:533: non-zero status: 1 cuMemAddressFree failed

/dvs/p4/build/sw/rel/gpgpu/toolkit/r12.8/main_nvshmem/src/host/mem/mem_heap.cpp:1591: non-zero status: 7 allocate_physical_memory_to_heap failed

/dvs/p4/build/sw/rel/gpgpu/toolkit/r12.8/main_nvshmem/src/host/proxy/proxy.cpp:130: NULL value failed allocating proxy_channel_g_buf
channel creation failed
srun: error: slurm-compute-node-1: task 11: Exited with exit code 255
slurmstepd: error: mpi/pmix_v5: _errhandler: slurm-compute-node-1 [1]: pmixp_client_v2.c:211: Error handler invoked: status = -61, source = [slurm.pmix.128.0:11]
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.

Hi @gsmith3,
This looks like an NVSHMEM problem. Please make sure your system and configuration meet the NVSHMEM hardware and software requirements (link). You can also run HPL without NVSHMEM (export HPL_USE_NVSHMEM=0).
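
As a rough sketch of the workaround, you could set the variable in the shell that launches the job and then rerun the same command from the original post. This assumes srun forwards the caller's environment to the tasks and into the container, which is the usual default but may differ on your cluster. The `command -v` guard is only there so the snippet fails cleanly when run outside a Slurm environment.

```shell
# Disable the NVSHMEM path in HPL, as suggested above.
export HPL_USE_NVSHMEM=0

# Same launch as in the original post. Assumption: srun passes the caller's
# environment (including HPL_USE_NVSHMEM) through to the containerized tasks.
if command -v srun >/dev/null 2>&1; then
  srun -N 2 --ntasks-per-node=8 --cpu-bind=none --mpi=pmix \
    --container-image="${CONT}" \
    ./hpl.sh --dat /workspace/hpl-linux-x86_64/sample-dat/HPL-H200-16GPUs.dat
else
  echo "srun not found; run this from a Slurm login node" >&2
fi
```

If the run completes with NVSHMEM disabled, that points to the NVSHMEM memory registration path seen in the log rather than to HPL itself.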

For further questions or to provide feedback, please contact HPCBenchmarks@nvidia.com.

Thanks