When connecting the CX8 NIC to the Q3400 switch, the subnet manager cannot be activated on either side

Both ends show the following output:

(base) root@node-18:~# opensm

OpenSM 5.21.12.MLNX20250617.f74e01b8

Command Line Arguments:
Log File: /var/log/opensm.log

OpenSM 5.21.12.MLNX20250617.f74e01b8

Using default GUID 0x5000e6030005562a
Entering DISCOVERING state

Error from osm_opensm_bind (0x2A)
Perhaps another instance of OpenSM is already running
Exiting SM

(base) root@node-18:~# sminfo
ibwarn: [16405] mad_rpc_open_port: client_register for mgmt 1 failed
sminfo: iberror: failed: Failed to open '(null)' port '0'

root@nvos:~# nv show system
                   operational          applied
uptime             1 day, 16:28:08
hostname           nvos
product-name       nvos
product-release    25.02.5002
status             System is ready
date-time
  local-time       2025-09-11 06:31:25
  timezone         Etc/UTC              Etc/UTC
health
  status           Not OK
version
  product-release  25.02.5002
root@nvos:~# ib sm
-bash: ib: command not found
root@nvos:~# sminfo
ibwarn: [587839] _do_madrpc: recv failed: Connection timed out
ibwarn: [587839] mad_rpc: _do_madrpc failed; dport (DR path slid 0; dlid 0; 0)
sminfo: iberror: failed: query

Hello,

This looks like a known issue. Please update to a newer version.

OpenSM 5.22.11.MLNX20250130.12243119 will fix the problem.
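
To check whether an installed OpenSM is older than the suggested one, a small comparison like the sketch below may help. It assumes the version string has the form `<release>.MLNX<build date>.<hash>` and that the leading release number is what determines "newer"; both are assumptions drawn from the two strings in this thread, not documented behavior. The helper name `opensm_release` is hypothetical. Note that comparing on the embedded date field instead would give the opposite answer for these two strings, since the suggested build carries 20250130 and the installed one 20250617.

```python
import re


def opensm_release(version: str) -> tuple[int, ...]:
    """Return the leading numeric release, e.g. (5, 21, 12).

    Assumes the '<release>.MLNX<date>.<hash>' layout seen in this thread.
    """
    match = re.match(r"(\d+(?:\.\d+)*)\.MLNX", version)
    if match is None:
        raise ValueError(f"unrecognized OpenSM version string: {version!r}")
    return tuple(int(part) for part in match.group(1).split("."))


installed = "5.21.12.MLNX20250617.f74e01b8"  # from the opensm banner in the question
suggested = "5.22.11.MLNX20250130.12243119"  # from this reply

# Tuples compare element by element, so (5, 21, 12) < (5, 22, 11).
needs_update = opensm_release(installed) < opensm_release(suggested)
print(needs_update)
```

Comparing integer tuples rather than raw strings avoids lexical surprises such as "5.9" sorting after "5.21".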

Just to add to my last comment: SM is not supported on the XDR switch system.

In our case, the cleanest solution was aligning firmware + management stack rather than fighting OpenSM on the host. Once everything was on a supported combo, CX8 links came up immediately.

It may be worth checking with a vendor that’s already deployed Q3400 + CX8 fabrics—they usually know which SM mode actually works in practice.

For reference, we had decent experience working with NADDOD on a similar Q3400 + ConnectX-8 bring-up. They were already familiar with the SM behavior and firmware combinations, which saved a lot of trial-and-error on our side.

