Hello, I’m trying to run NVIDIA’s DOCA dpa_all_to_all sample (see the NVIDIA DOCA DPA All-to-all Application Guide in the NVIDIA Docs) on a BlueField-3 ARM SoC, but it fails during MPI_Init when UCX tries to initialize.
The program runs fine on the host CPU, but it does not run on the BF-3 ARM SoC.
Platform / software
What I have tried
ulimit unlimited
Tried export UCX_IB_MLX5_DEVX_UAR=n, export UCX_IB_MLX5_DEVX=n
Checked RDMA sanity
Failure log
mpirun -np 2 ./doca_dpa_all_to_all
[2026-01-26 05:03:53:920797][2667169888][DOCA][INF][doca_log.cpp:633] DOCA version 3.2.1025
[2026-01-26 05:03:53:922579][903727200][DOCA][INF][doca_log.cpp:633] DOCA version 3.2.1025
[1769403834.150458] [localhost:1925108:0] ib_iface.c:1315 UCX ERROR mlx5_2: iface 0xb76343ef3500 failed to create UD QP TX wr:256 sge:6 inl:64 resp:0 RX wr:4096 sge:1 resp:0 failed: Cannot allocate memory
[1769403834.151192] [localhost:1925108:0] ucp_worker.c:1415 UCX ERROR uct_iface_open(ud_verbs/mlx5_2:1) failed: Input/output error
[localhost.localdomain:1925108] pml_ucx.c:314 Error: Failed to create UCP worker
[1769403834.166861] [localhost:1925107:0] ib_iface.c:1363 UCX ERROR mlx5_2: ibv_create_cq(cqe=256) failed: Invalid argument
[1769403834.166945] [localhost:1925107:0] ucp_worker.c:1415 UCX ERROR uct_iface_open(ud_verbs/mlx5_2:1) failed: Input/output error
[localhost.localdomain:1925107] pml_ucx.c:314 Error: Failed to create UCP worker
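Both ranks in the log above fail while opening the same interface. As a rough sketch (the helper name failed_ifaces is made up for illustration and is not part of UCX or Open MPI), a sed filter can list the transport/device pairs named in uct_iface_open failures from saved mpirun output:

```shell
#!/usr/bin/env bash
# Hypothetical helper, for illustration only.
# Reads log text on stdin and prints each distinct transport/device pair
# that appears inside a "uct_iface_open(...) failed" message.
failed_ifaces() {
    sed -n 's/.*uct_iface_open(\([^)]*\)) failed.*/\1/p' | sort -u
}
```

It could be used as mpirun -np 2 ./doca_dpa_all_to_all 2>&1 | failed_ifaces, which makes it easier to see at a glance that every failure involves ud_verbs on mlx5_2:1.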
xyin
February 5, 2026, 3:18am
2
Hi dragonj5290
Since you have already tried disabling DEVX, can you also make sure the ulimit setting takes effect?
export UCX_IB_MLX5_DEVX=n
export UCX_IB_MLX5_DEVX_UAR=n
ulimit -l unlimited
You can also run ulimit -l to check whether it is unlimited, then run mpirun again.
If the run still fails, check the kernel log:
dmesg | egrep -i 'mlx5_cmd_check|ALLOC_UAR|limits exceeded|Cannot allocate memory' | tail -n 50
If you see messages such as "limits exceeded" or "ALLOC_UAR failed", that clearly indicates UAR resource exhaustion on the HCA side. In that case, consider modifying the UAR-related configuration with the mlxconfig command.
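The checks above can be wrapped in two small shell helpers. This is only a sketch: the function names memlock_ok and filter_uar_errors are hypothetical and are not part of DOCA, UCX, or Open MPI. The grep pattern is the same one used in the dmesg command above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, for illustration only.

# Takes a memlock value as printed by `ulimit -l` and reports whether it
# is unlimited. Returns non-zero when a finite limit is in effect.
memlock_ok() {
    if [ "$1" = "unlimited" ]; then
        echo "memlock ok"
    else
        echo "memlock limited: $1"
        return 1
    fi
}

# Reads kernel log text on stdin and keeps at most the last 50 lines
# that match the UAR-related patterns.
filter_uar_errors() {
    grep -E -i 'mlx5_cmd_check|ALLOC_UAR|limits exceeded|Cannot allocate memory' | tail -n 50
}
```

For example, memlock_ok "$(ulimit -l)" checks the current shell, and sudo dmesg | filter_uar_errors scans the kernel log. An empty result from the second command only means that none of these patterns were logged; it does not rule out other causes.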
regards
xyin
Thank you for your answer.
However, I tried these steps and ran into exactly the same error as before.
Strangely, the dmesg command showed nothing, even with sudo.
xyin
February 9, 2026, 6:56am
4
Hey Dragonj5290
Thanks for the update.
To make sure the ulimit setting takes effect, can you check whether mpirun has applied the change?
mpirun -np 1 bash -c 'echo "memlock: $(ulimit -l)"; ulimit -a'
Also, can you see any mlx5-related messages in the kernel log, for example with:
dmesg | grep mlx5
thanks
xyin
mpirun -np 1 bash -c 'echo "memlock: $(ulimit -l)"; ulimit -a'
memlock: unlimited
real-time non-blocking time (microseconds, -R) unlimited
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 31415
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 1048576
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 31415
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Dear xyin
Thank you for your constant help.
The results seem fine, but I’m still struggling to run the program.
Thank you