Open MPI 3 fails with "No OpenFabrics connection schemes reported that they were able to be used on a specific port."

I installed Red Hat 7.5 on two machines with the following Mellanox cards:

87:00.0 Network controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]

I followed the steps outlined here to verify RDMA is working:

https://community.mellanox.com/s/article/howto-enable-perftest-package-for-upstream-kernel

However, I cannot seem to get Open MPI 3.0.2 to work. When I run it, I get this error:


No OpenFabrics connection schemes reported that they were able to be
used on a specific port. As such, the openib BTL (OpenFabrics
support) will be disabled for this port.

Local host: lustwzb34
Local device: mlx4_0
Local port: 1
CPCs attempted: rdmacm, udcm


Then it just hangs until I press Ctrl+C.

I understand this may be an issue with Red Hat, Open MPI, or Mellanox. Any ideas on how to narrow down which one is at fault?

Thanks!

Hi Faraz,

Regarding this message:

"No OpenFabrics connection schemes reported that they were able to be
used on a specific port. As such, the openib BTL (OpenFabrics
support) will be disabled for this port."

This message is related to InfiniBand. It means that Open MPI did not detect any device with InfiniBand connectivity that it can use.

Are you working in Ethernet mode (RoCE) or InfiniBand?
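
One way to answer this question is to look at the link_layer field that ibv_devinfo prints for each port: it reads InfiniBand or Ethernet, where Ethernet means RoCE. The following is a minimal sketch, assuming ibv_devinfo is installed (on RHEL it ships in the libibverbs-utils package) and that its output uses the usual "port:", "state:" and "link_layer:" field names. The helper name summarize_ports is hypothetical and not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: reads ibv_devinfo-style output on stdin and prints
# one line per port in the form "port N: STATE, LINK_LAYER".
summarize_ports() {
  awk '
    /^[[:space:]]*port:/       { port = $2 }
    /^[[:space:]]*state:/      { state = $2 }
    /^[[:space:]]*link_layer:/ { print "port " port ": " state ", " $2 }
  '
}

# Only query the hardware if the tool is actually present.
if command -v ibv_devinfo >/dev/null 2>&1; then
  ibv_devinfo | summarize_ports
else
  echo "ibv_devinfo not found; install libibverbs-utils first" >&2
fi
```

A port that reports PORT_ACTIVE with link_layer Ethernet is running RoCE. A port that is not in the PORT_ACTIVE state is worth fixing before looking at Open MPI itself.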

Regards,

Chen

Thanks. I actually got Open MPI to work by doing the following:

  • Reinstalling the Red Hat InfiniBand-related drivers via yum
  • Rebooting the machines
  • Building Open MPI from source. The package installed from the Red Hat repo does not seem to work.
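
For anyone retracing the steps above, they can be sketched roughly as the script below. This is only an illustration under several assumptions, not the exact commands that were run: the yum group name "Infiniband Support", the install prefix, the --with-verbs configure option, and the helper names run and rebuild_openmpi are all assumed, and the Open MPI 3.0.2 source tarball is assumed to be already unpacked in the current directory. By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch only: prints each command unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"
PREFIX="${PREFIX:-$HOME/openmpi-3.0.2}"   # hypothetical install prefix

# Hypothetical helper: echo the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

rebuild_openmpi() {
  # Step 1: (re)install the distribution's InfiniBand stack.
  # The group name is an assumption; check "yum grouplist" on your system.
  run sudo yum -y groupinstall "Infiniband Support"

  # Step 2: reboot the machine here, then continue with step 3.

  # Step 3: build Open MPI from the unpacked source tree with verbs support.
  run ./configure --prefix="$PREFIX" --with-verbs
  run make -j4 all
  run make install
}

rebuild_openmpi
```

After the build, make sure the newly installed mpirun comes first in PATH on both machines, so that the copy from the Red Hat repo is not picked up by accident.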