Configuring inter-switch links between SX6036 QSFP ports

Apologies in advance as I am new to posting.

I am responsible for configuring two SX6036 switches. Another team is responsible for configuring the servers and the InfiniBand adapters in them.

This is the network topology:

IBM Knowledge Center https://www.ibm.com/support/knowledgecenter/en/SSEPGG_11.1.0/com.ibm.db2.luw.qb.server.doc/doc/t0059424.html

The only deviation is that there are 3 member servers, not 4, and each member is patched to the IB switches in the same manner as the CF servers (2 adapters, each with 2 ports).

The subnet manager is running on both SX6036 switches. HA is running over the switch management ports, with one switch set to a higher priority (master) and the other acting as standby. My understanding is that no other configuration is required on the switches.

  1. How are the inter-switch links used? My understanding is that they are only used if an adapter fails on a server. True or false?

  2. If an adapter fails, how do the inter-switch links kick in? Are they automatically configured to switch IB traffic on any IB subnet? The servers are configured with IP over IB.

  3. The diagram refers to the inter-switch links being set up as an LACP channel. Is this done automatically, or is there some configuration I need to apply?

Thanks in advance for any help!

Greg

Hi Greg,

InfiniBand architecture differs from Ethernet in that the fabric uses all the ISL connections together. In short, the Subnet Manager (SM) calculates and provisions static routes from every end point (HCA port) to every other end point. This is typically done using a set of rules that depend on the topology, and the routes are set up so that there is no risk of a loop in the network. The SM also aims to spread the routes as evenly as possible across all the available links to maximize bandwidth utilization within the fabric itself.
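To make the idea of spreading routes concrete, here is a toy sketch in Python. It is not how the SM on the switch actually computes routes; the round-robin rule, the function name `assign_routes`, and the LID and port numbers are all assumptions chosen for illustration.

```python
from collections import Counter


def assign_routes(remote_lids, isl_ports):
    """Toy forwarding table for one switch.

    Maps each destination LID that sits behind the other switch to one
    inter-switch link (ISL) port, round-robin, so that the static routes
    are spread evenly across the ISLs. A real SM uses a routing
    algorithm that takes the whole topology into account.
    """
    if not isl_ports:
        raise ValueError("no inter-switch links available")
    return {
        lid: isl_ports[i % len(isl_ports)]
        for i, lid in enumerate(sorted(remote_lids))
    }


# Hypothetical values: 8 HCA ports cabled to the other switch, 4 ISL ports.
remote_lids = range(10, 18)
isl_ports = [33, 34, 35, 36]

table = assign_routes(remote_lids, isl_ports)
load = Counter(table.values())

print(table)  # one fixed ISL port per destination LID
print(load)   # number of routes carried by each ISL port
```

With these made-up numbers every ISL ends up carrying two destinations, which is the kind of even spread the SM is aiming for. Each route is static: a given destination always leaves through the same ISL until the SM reprograms the table.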

If a link goes down during the life of the network, the SM detects the event (it usually receives a trap) and recalculates the routes using the remaining links. This is all done automatically; there is nothing to configure.
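The reaction to a failed link can be sketched the same way. Again this is only a toy model under assumed names and numbers (`build_table`, `on_link_down`, ports 33 and 34); the real SM reacts to a trap and reprograms the switch forwarding tables, which this snippet only imitates with a dictionary.

```python
def build_table(remote_lids, live_ports):
    """Assign each destination LID to one of the live ISL ports, round-robin."""
    ports = sorted(live_ports)
    if not ports:
        raise RuntimeError("no path left to the other switch")
    return {lid: ports[i % len(ports)] for i, lid in enumerate(sorted(remote_lids))}


def on_link_down(remote_lids, live_ports, failed_port):
    """Imitate the SM handling a link-down event.

    Drops the failed port from the set of usable ISLs and recomputes the
    routes over whatever is left. Nothing is configured by hand.
    """
    remaining = set(live_ports) - {failed_port}
    return remaining, build_table(remote_lids, remaining)


# Hypothetical values: 4 destinations behind the other switch, 2 ISLs.
lids = [10, 11, 12, 13]
live = {33, 34}

before = build_table(lids, live)
live, after = on_link_down(lids, live, failed_port=34)

print(before)  # routes alternate between ports 33 and 34
print(after)   # every route now uses port 33
```

If the last remaining ISL also fails, `build_table` raises an error, which stands in for the point where the two switches can no longer reach each other over the fabric.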

Lastly, the notion of “LACP” does not exist in InfiniBand. It is simply not needed, because the SM already spreads the routes across all the parallel links.

I hope this helps. You can read more about the InfiniBand architecture online; there are many resources at different levels.

Cheers!

NP Greg. My pleasure.

Thanks Yairi. It sounds like my configuration is good. Some IBM guys who recommended it for their pureScale solution checked my configuration and advised that it is fine. We had a DB2 failure when we tested failing one Mellanox switch, and we are still trying to isolate the problem. If DB2 fails again in the next test, they have asked that a ticket be opened with IBM support.

Thanks again for helping me further my knowledge on this.

Greg

I found an entry in the switch log that suggests something may be missing from the switch setup:

“Master pm[5916]: (pid 6878): Found remote SM (0,31,1) with non-matching sm_key”
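To check whether the same message shows up repeatedly or on both switches, a small script along these lines could scan an exported copy of the log. This is only a sketch: the regular expression is built from the single line quoted above, so the assumption that other entries follow the same layout is unverified, and the function name is made up.

```python
import re

# Pattern derived from the one log line quoted above. The three numbers
# in parentheses are kept as raw text because their meaning is not
# stated in the log message itself.
SM_KEY_MISMATCH = re.compile(
    r"(?P<process>\w+)\[(?P<pid>\d+)\]: .*"
    r"Found remote SM \((?P<remote>[^)]*)\) with non-matching sm_key"
)


def find_sm_key_mismatches(lines):
    """Return (process, pid, remote) for every sm_key mismatch entry."""
    hits = []
    for line in lines:
        match = SM_KEY_MISMATCH.search(line)
        if match:
            hits.append(
                (match.group("process"), int(match.group("pid")), match.group("remote"))
            )
    return hits


sample = [
    "Master pm[5916]: (pid 6878): Found remote SM (0,31,1) with non-matching sm_key",
    "some unrelated log line",
]
hits = find_sm_key_mismatches(sample)
print(hits)
```

For a real log, the list passed in would be the lines of the exported file, for example `open(path).read().splitlines()` with `path` pointing at wherever the log was saved.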

We ran more tests, and all members failed in DB2 after all the CF links to the HA standby switch were disconnected. We expected that the HA master switch would remain the active SM and that the remaining active links would be sufficient to keep DB2 running.

Any ideas? Is there some sort of licence coordination that is required?

Thanks,

Greg