How can I run MPI between different versions of HCAs (e.g. ConnectX-3 and ConnectX-4) in a subnet?

Hello,

I’m trying to run the OSU micro-benchmark (osu_bw) between two servers (and among several servers), specifically across different versions of HCAs.

Here is my server configuration:

  • Server FDR_S1: CX455A-FCAT (single port, mlx5_0), exposed as net device ib0
  • Server FDR_S2: CX455A-FCAT (single port, mlx5_0), exposed as net device ib1
  • Server FDR_S3: CX354A-FCAT (dual port, mlx4_1:2), exposed as net device ib2
  • Server FDR_S4: CX354A-FCBT (dual port, mlx4_1:2), exposed as net device ib2
  • Servers FDR_S2, S3, and S4 have another HCA (CX353A-QCBT, QDR), but it is not used.
  • Servers are connected via an FDR switch
  • Software on CentOS 7.6:
    • Open MPI 1.10.2
    • MLNX_OFED_LINUX-4.6-1.0.1.1
    • OSU micro-benchmarks v5.6.1
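For reference, a quick way to double-check which HCA backs which IPoIB net device on each host is shown below. This is only a verification sketch (ibdev2netdev ships with MLNX_OFED, ibstat and ibv_devinfo with the standard InfiniBand userspace packages); the exact output will vary per host:

ibstat | grep -E "CA '|State|Rate"   # each HCA with its link state and rate (FDR shows as 56)
ibdev2netdev                         # e.g. "mlx5_0 port 1 ==> ib0 (Up)"
ibv_devinfo -l                       # lists the verbs device names (mlx4_* / mlx5_*)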

The osu_bw program works well between identical HCAs, but it fails to run between different HCAs.

The following is the mpirun command:

mpirun -np 2 -npernode 1 -H FDR_S1,FDR_S4 \
  --mca btl openib,self \
  --mca btl_openib_if_include mlx5_0,mlx4_1:2 \
  --bind-to core --cpu-set 0,1,2 \
  /usr/local/osu-benchmarks-5.6.1/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_bw

The results are shown below:

OSU MPI Bandwidth Test v5.6.1
Size                  Bandwidth (MB/s)
libibverbs: ibv_create_ah failed to query port.
[fdr_s2:17360] *** An error occurred in MPI_Isend
[fdr_s2:17360] *** reported by process [1401946113,0]
[fdr_s2:17360] *** on communicator MPI_COMM_WORLD
[fdr_s2:17360] *** MPI_ERR_OTHER: known error not in list
[fdr_s2:17360] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[fdr_s2:17360] *** and potentially your MPI job)

Running osu_bw between identical HCAs (e.g., FDR_S1 <-> FDR_S2, and FDR_S3 <-> FDR_S4) works well, even without the "btl_openib_if_include" option.

In case of assigning only one HCA type, like "-mca btl_openib_if_include mlx4_1:2", other kinds of errors occurred:


WARNING: One or more nonexistent OpenFabrics devices/ports were specified:

  Host:                 fdr_s1
  MCA parameter:        mca_btl_if_include
  Nonexistent entities: mlx4_1:2

These entities will be ignored. You can disable this warning by setting the btl_openib_warn_nonexistent_if MCA parameter to 0.



At least one pair of MPI processes are unable to reach each other for MPI communications. This means that no Open MPI device has indicated that it can be used to communicate between these processes. This is an error; Open MPI requires that all MPI processes be able to reach each other. This error can sometimes be the result of forgetting to specify the "self" BTL.

  Process 1 ([[20510,1],0]) is on host: fdr_s1
  Process 2 ([[20510,1],1]) is on host: FDR_S4
  BTLs attempted: self

Your MPI job is now going to abort; sorry.



MPI_INIT has failed because at least one MPI process is unreachable from another. This usually means that an underlying communication plugin -- such as a BTL or an MTL -- has either not loaded or not allowed itself to be used. Your MPI job will now abort.

You may wish to try to narrow down the problem:

  • Check the output of ompi_info to see which BTL/MTL plugins are available.
  • Run your application with MPI_THREAD_SINGLE.
  • Set the MCA parameter btl_base_verbose to 100 (or mtl_base_verbose, if using MTL-based communications) to see exactly which communication plugins were considered and/or discarded.


*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[fdr_s1:19147] Local abort before MPI_INIT completed successfully; not able to aggregate error messages, and not able to guarantee that all other processes were killed!
[fdr_s4][[20510,1],1][…/…/…/…/…/ompi/mca/btl/openib/btl_openib_proc.c:157:mca_btl_openib_proc_create] […/…/…/…/…/ompi/mca/btl/openib/btl_openib_proc.c:157] ompi_modex_recv failed for peer [[20510,1],0]
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[fdr_s4:18841] Local abort before MPI_INIT completed successfully; not able to aggregate error messages, and not able to guarantee that all other processes were killed!


Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted.

mpirun detected that one or more processes exited with non-zero status, thus causing the job to be terminated. The first process to do so was:

  Process name: [[20510,1],0]
  Exit code: 1
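Following the suggestions printed in the help message above, a plausible next debugging step is to confirm which BTL plugins this Open MPI build actually provides and then re-run the same command with verbose BTL selection, so each rank reports which devices and ports it considered or discarded. This is only a sketch; the exact output will differ per installation:

ompi_info | grep btl                 # confirm the openib and self BTLs are available
mpirun -np 2 -npernode 1 -H FDR_S1,FDR_S4 \
  --mca btl openib,self \
  --mca btl_openib_if_include mlx5_0,mlx4_1:2 \
  --mca btl_base_verbose 100 \
  /usr/local/osu-benchmarks-5.6.1/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_bw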


Has anyone solved this kind of problem?

Best regards

I’m closing this case.

I just solved the problem. It was due to confusion about the net devices in the servers.
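For anyone who hits the same symptom: since the root cause here was a mix-up of net devices, one sanity check is to confirm that each hostname used with -H resolves to the address on the interface that actually sits on the intended HCA. This is purely illustrative and assumes IPoIB addresses are configured on the ib0/ib1/ib2 interfaces and that the hostnames are the ones listed above:

ip=$(getent hosts FDR_S4 | awk '{print $1}')    # IP address the hostname resolves to
ssh FDR_S4 "ip -o addr show | grep -w $ip"      # which net device (ib0/ib1/ib2/eth*) carries that address
ssh FDR_S4 ibdev2netdev                         # which HCA and port back that net device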