Hoping to get a minimally viable IB setup working with 1 node and 2 HCAs

I have one host with two Mellanox HCA cards and wanted to find out whether it is possible to run IB traffic between them, similar to a "loopback" test.

I have installed the driver stack from DOCA and everything looks OK: mst start, ibstat, and mlxconfig all work so far. The only catch is that, because no cable is plugged in, all ports are currently down.

I asked Grok whether it is possible to run some 2-node traffic without a cable plugged in, while the ports are still in the Down state, and it said something like:

  1. Configure the cards in eth mode (worked so far)

  2. Configure the LID (using opensm); this is where I am having an issue

  3. Run perftest as usual (e.g. ib_write_bw -d mlx5_<#> -i 1)

opensm -g <Port GUID from ibstat>

If I clear the log and run the command above, this is the log I get:

nonroot@nonroot-SYS-7049GP-TRT:~/extdir/gg/wget$ cat /var/log/opensm.log
Nov 12 10:52:56 886245 [29290740] 0x03 -> OpenSM 5.24.0.MLNX20250722.185f9e32
Nov 12 10:52:56 886311 [29290740] 0x80 -> OpenSM 5.24.0.MLNX20250722.185f9e32
Nov 12 10:52:56 892091 [29290740] 0x02 -> osm_vendor_init: 1000 pending umads specified
Nov 12 10:52:56 892248 [29290740] 0x02 -> osm_vendor_init: 1000 pending umads specified
Nov 12 10:52:56 892336 [29290740] 0x02 -> osm_vendor_init: 1000 pending umads specified
Nov 12 10:52:56 897600 [29290740] 0x02 -> osm_tenant_mgr_init: tenant mgr is disabled
Nov 12 10:52:56 897770 [29290740] 0x80 -> Entering DISCOVERING state
Nov 12 10:52:56 897945 [29290740] 0x02 -> osm_issu_mgr_init: issu_mgr is initialized
Nov 12 10:52:56 898142 [29290740] 0x02 -> osm_vendor_rebind: Mgmt class 0x81 binding to port GUID 0xba599ffffe431320
Nov 12 10:52:56 904975 [29290740] 0x01 -> osm_vendor_rebind: ERR 5426: Unable to register class 129 version 1
Nov 12 10:52:56 904992 [29290740] 0x01 -> osm_sm_mad_ctrl_bind: ERR 3118: Vendor specific bind failed
Nov 12 10:52:56 904997 [29290740] 0x01 -> osm_sm_bind: ERR 2E10: SM MAD Controller bind failed (IB_ERROR) for port guid 0xba599ffffe431320, port index 0
Nov 12 10:52:56 908933 [29290740] 0x02 -> osm_tenant_mgr_destroy: osm_tenant_mgr_destroy complete
Nov 12 10:52:56 908960 [29290740] 0x02 -> osm_issu_mgr_destroy: osm_issu_mgr_destroy complete
Nov 12 10:52:56 909064 [29290740] 0x80 -> Exiting SM

Here is the grok instruction:

ibstat | egrep "mlx|UUID|State|GU"
CA 'mlx5_0'
Node GUID: 0xb8599f0300431320
System image GUID: 0xb8599f0300431320
State: Down
Port GUID: 0xba599ffffe431320
CA 'mlx5_1'
Node GUID: 0xb8599f0300431321
System image GUID: 0xb8599f0300431320
State: Down
Port GUID: 0xba599ffffe431321
CA 'mlx5_2'
Node GUID: 0x0c42a103004a6b00
System image GUID: 0x0c42a103004a6b00
State: Down
Port GUID: 0x0e42a1fffe4a6b00
CA 'mlx5_3'
Node GUID: 0x0c42a103004a6b01
System image GUID: 0x0c42a103004a6b00
State: Down
Port GUID: 0x0e42a1fffe4a6b01

Just a quick reminder that running InfiniBand traffic without a physical link is not possible. OpenSM can only assign LIDs and bring ports to the Active state once the physical link is up, which requires a cable to another port or to a switch. While the link is down, OpenSM cannot bring the ports up or assign LIDs, so perftest will not work in this scenario.

If you’d like to test on a single host, the supported approach is to connect two ports back‑to‑back with an appropriate IB cable and then run OpenSM and perftest.

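Also, the ports on both HCAs must be set to IB link type rather than Ethernet. The bind failure in your opensm log (ERR 5426: Unable to register class 129 version 1) is consistent with the port being in Ethernet mode, where OpenSM has no subnet management interface to bind to.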
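
To see where each port currently stands, it can help to condense the ibstat output to one line per device. This is only a minimal sketch, assuming the usual ibstat field names (CA, State, Link layer); summarize_ports is a hypothetical helper name, not part of any tool.

```shell
# summarize_ports: hypothetical helper that reads ibstat-style text on stdin
# and prints "<device> <state> <link layer>" for each port.
summarize_ports() {
  tr -d "'" | awk '
    $1 == "CA" && NF == 2          { ca = $2 }      # header line, e.g. CA mlx5_0
    $1 == "State:"                 { state = $2 }   # logical port state
    $1 == "Link" && $2 == "layer:" { print ca, state, $3 }
  '
}

# Only call ibstat if it is installed on this machine.
if command -v ibstat >/dev/null 2>&1; then
  ibstat | summarize_ports
fi
```

A port is ready for perftest only when it shows up as Active with link layer InfiniBand. Down means the physical link is not up yet.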

For each HCA, set both ports to IB. In mlxconfig, LINK_TYPE value 1 selects IB and value 2 selects Ethernet. Run the following commands to change the ports back to IB mode, then reboot the server for the change to take effect:

sudo mlxconfig -d <PCI_of_HCA1> set LINK_TYPE_P1=1 LINK_TYPE_P2=1
sudo mlxconfig -d <PCI_of_HCA2> set LINK_TYPE_P1=1 LINK_TYPE_P2=1
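
Before rebooting, the pending setting can be double-checked with an mlxconfig query. The sketch below assumes the usual two-column name/value layout of the query output, where IB is reported as IB(1) and Ethernet as ETH(2); link_types is a hypothetical helper name.

```shell
# link_types: hypothetical helper that reads `mlxconfig query` text on stdin
# and prints each LINK_TYPE_P<n> setting as NAME=VALUE.
link_types() {
  awk '$1 ~ /^LINK_TYPE_P[0-9]+$/ { print $1 "=" $2 }'
}

# Usage (same device placeholder as in the set commands):
#   sudo mlxconfig -d <PCI_of_HCA1> query | link_types
```

Both ports of each HCA should report IB(1) here before you reboot.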

In simple words:
The HCAs must be in IB mode and the two IB ports must be physically connected. Once that is the case, start opensm and the LIDs can be assigned.
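
Once the ports are cabled and in IB mode, the whole sequence could look roughly like the sketch below. Everything specific in it is an assumption: it takes the Port GUID of mlx5_0 from the ibstat output in the question and assumes mlx5_0 is cabled back-to-back to mlx5_2. By default it only prints the commands; set DRY_RUN=0 to actually run them.

```shell
# Assumed values, taken from the ibstat output in the question.
# Adjust them to the two ports you actually cabled together.
SM_GUID=${SM_GUID:-0xba599ffffe431320}   # Port GUID of mlx5_0
SERVER_DEV=${SERVER_DEV:-mlx5_0}
CLIENT_DEV=${CLIENT_DEV:-mlx5_2}
DRY_RUN=${DRY_RUN:-1}                    # 1 = only print the commands

# Print a command, and execute it unless DRY_RUN=1.
run()    { echo "+ $*";   [ "$DRY_RUN" = 1 ] || "$@"; }
# Same, but start the command in the background.
run_bg() { echo "+ $* &"; [ "$DRY_RUN" = 1 ] || "$@" & }

loopback_test() {
  run sudo opensm -B -g "$SM_GUID"                  # start the SM as a daemon on one port
  run sleep 5                                       # give the SM time to sweep the link
  run ibstat "$SERVER_DEV"                          # expect State: Active and a non-zero Base lid
  run_bg ib_write_bw -d "$SERVER_DEV" -i 1          # perftest server side
  run sleep 2                                       # let the server start listening
  run ib_write_bw -d "$CLIENT_DEV" -i 1 localhost   # perftest client side
  wait                                              # wait for the background server to exit
}

loopback_test
```

The client reaches the server over localhost only for the initial connection setup; the RDMA traffic itself goes over the cable between the two ports.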