3 compute node switchless

Hello, I have three compute nodes equipped with dual-port Mellanox ConnectX-4 cards.

Each compute node is directly connected to the other two nodes, with no switch in between (a sort of hyper-cube).

If I start a subnet on two nodes, I’m able to start an MPI (RDMA) job on those two nodes. If I start two subnets and try to execute my application on the three nodes, the MPI processes are started on all compute nodes, but after a few seconds the job fails.

I tried to follow this suggestion:


but it doesn’t seem to work in my case.

Can anyone help me understand how to configure this kind of setup?

Thank you!


Hi Emanuele,

Can you please provide the error that you are seeing when the job fails?



Hi Chen,

Sorry for the late reply.

I have 3 nodes called DUMBO, TIMOTEO and JIMCORVO
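
To make the cabling concrete: with every node wired directly to the other two, there are three point-to-point links and each node uses both of its ports. A minimal sketch that enumerates them (the node names are the ones above; the script itself is only an illustration, not part of my setup):

```python
from itertools import combinations

# Node names as listed in this post.
nodes = ["DUMBO", "TIMOTEO", "JIMCORVO"]

# In a switchless full mesh, every pair of nodes shares one direct cable.
links = list(combinations(nodes, 2))

# Each node needs one port per link it takes part in.
ports_per_node = {n: sum(n in link for link in links) for n in nodes}

for a, b in links:
    print(f"{a} <-> {b}")
print(ports_per_node)
```

So a job spanning all three nodes has to be able to use all three links, not only two of them.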

this is part of /etc/hosts on Timoteo: TIMOTEO21 TIMOTEO tim-ib TIMOTEO23 jimcorvo12 JIMCORVO jim-ib jimcorvo13 DUMBO32 DUMBO dumbo-ib DUMBO31

I try to run the job with this command:

mpirun -genvall -genv I_MPI_HYDRA_DEBUG 1 -genv I_MPI_FABRICS=shm:ofi -n 24 -ppn 8 -hostfile hostfile ./wrf.exe

“hostfile” contains the names of the 3 nodes
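
Since every node has more than one name in /etc/hosts, it may matter which address each hostfile entry resolves to on each node. Below is a minimal sketch for checking that, assuming the hostfile has one hostname per line; `resolve_hosts` is a hypothetical helper name, and the `resolver` argument exists only so the function can be tried without real DNS or /etc/hosts lookups:

```python
import socket

def resolve_hosts(lines, resolver=socket.gethostbyname):
    """Map each hostname in a hostfile to its resolved address (None if lookup fails)."""
    result = {}
    for line in lines:
        # Drop comments and surrounding whitespace; skip empty lines.
        name = line.split("#", 1)[0].strip()
        if not name:
            continue
        try:
            result[name] = resolver(name)
        except OSError:
            # socket.gaierror is a subclass of OSError.
            result[name] = None
    return result
```

Running `resolve_hosts(open("hostfile"))` on each of the three nodes and comparing the results should show whether a given name points to a different interface depending on where it is resolved.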

the application is reporting errors like:

Abort(1014056975) on node 7 (rank 7 in comm 0): Fatal error in PMPI_Comm_dup: Other MPI error, error stack:

PMPI_Comm_dup(179)…: MPI_Comm_dup(MPI_COMM_WORLD, new_comm=0x7ffe1f481868) failed




MPIR_Get_contextid_sparse_group(498): Failure during collective

I’m also linking the console output that I’m getting from mpiexec and the pcap file collected by ibdump on one of the two ports on Timoteo.



Please let me know if further details are required

thanks in advance!