Hi,
I have an HDR network with multiple nodes connected. I am now trying to connect a 9790 NDR switch to this HDR network using a copper OSFP Finned to 2x QSFP cable. A node with a ConnectX-7 card is attached to the 9790 switch, and I want that node to be reachable from the entire network. This is where the issues arise.
The switch and the NDR node both show up in my network when I run ibswitches; however, on the node I cannot get the InfiniBand interface to come up.
ibstat shows:
CA 'ibp26s0'
    CA type: MT4129
    Number of ports: 1
    Firmware version: 28.38.1002
    Hardware version: 0
    Node GUID: 0xa088c203006043b6
    System image GUID: 0xa088c203006043b6
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 400
        Base lid: 109
        LMC: 0
        SM lid: 221
        Capability mask: 0xa751e848
        Port GUID: 0xa088c203006043b6
        Link layer: InfiniBand
whereas dmesg gives me the error:
ibp26s0: multicast join failed for ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -22
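For reference, here is a minimal Python sketch of how the group address in that message breaks down, assuming the IPoIB multicast mapping from RFC 4391. The helper name decode_ipoib_mgid is hypothetical and not part of any InfiniBand tool. It only shows which group the join targets and which errno the status corresponds to; it does not explain why the join is rejected.

```python
import errno
import ipaddress


def decode_ipoib_mgid(mgid: str) -> dict:
    """Split an IPoIB multicast GID into its fields (layout assumed from RFC 4391)."""
    raw = ipaddress.IPv6Address(mgid).packed  # 16 raw bytes of the GID
    return {
        "is_multicast": raw[0] == 0xFF,       # first byte 0xff marks a multicast GID
        "flags": raw[1] >> 4,                 # upper nibble of the second byte
        "scope": raw[1] & 0x0F,               # lower nibble, 2 = link-local
        "signature": "0x" + raw[2:4].hex(),   # 0x401b = IPv4, 0x601b = IPv6
        "pkey": "0x" + raw[4:6].hex(),        # partition key the group belongs to
        "group_id": "0x" + raw[12:16].hex(),  # low 32 bits, all ones = broadcast
    }


info = decode_ipoib_mgid("ff12:401b:ffff:0000:0000:0000:ffff:ffff")
for field, value in info.items():
    print(f"{field}: {value}")

# The kernel reports a negative errno; look up the symbolic name for 22.
status_name = errno.errorcode[22]
print("status -22:", status_name)
```

Under those assumptions, the failing join is for the IPv4 broadcast group on the default partition (P_Key 0xffff), and status -22 corresponds to EINVAL.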
I have tried changing the MTU from 4092 to 2044, but that did not help either.
ibportstate gives:
ibportstate 109 1
CA/RT PortInfo:
# Port info: Lid 109 port 1
LinkState:.......................Active
PhysLinkState:...................LinkUp
Lid:.............................109
SMLid:...........................221
LMC:.............................0
LinkWidthSupported:..............1X or 4X or 2X
LinkWidthEnabled:................1X or 4X or 2X
LinkWidthActive:.................4X
LinkSpeedSupported:..............2.5 Gbps
LinkSpeedEnabled:................2.5 Gbps
LinkSpeedActive:.................Extended speed
LinkSpeedExtSupported:...........14.0625 Gbps or 25.78125 Gbps or 53.125 Gbps or 106.25 Gbps
LinkSpeedExtEnabled:.............14.0625 Gbps or 25.78125 Gbps or 53.125 Gbps or 106.25 Gbps
LinkSpeedExtActive:..............106.25 Gbps
Mkey:............................<not displayed>
MkeyLeasePeriod:.................0
ProtectBits:.....................0
This is all on Debian 12 with the native Debian drivers and an unconfigured, out-of-the-box opensm. What am I missing? Is it a misconfiguration of opensm, where NDR speeds are not allowed on HDR networks? Currently my opensm is running on an HDR node.
I have another node (ConnectX-7, NDR) which is connected directly to the HDR 8790 switch via a flat OSFP to 2x QSFP cable. That one works fine: I can use IPoIB and mount things over RDMA.