How to do L2 East-West Traffic with IPL and MLAG?

Hi,

We have multiple pairs of HPE SN2410M switches (they’re just Mellanox switches painted a different colour, which is why I’m here!). We have them configured as MLAG pairs with a VIP, and they work well. We’re upgrading them to Onyx v3.9.1306 at the moment.

We also have some Nexus 9K switches in exactly the same configuration - MLAG pairs (or vPC, as Cisco calls it). In all cases, all the L3 routing is done by core switches upstream. These switches are all top-of-rack for servers, so everything is MLAGs down to the hosts. The HPE switches have mgmt0 connected back-to-back for the VIP, and the IPL goes over a port-channel of two 100G DACs.

However, one major difference I’ve noticed between the HPE and the Nexus is the switch-to-switch behaviour. The Nexus will happily put traffic through the connection between the switches, so there’s East-West communication over it. The HPEs don’t - the documentation clearly says the IPL will only carry traffic in a failure scenario (i.e. when the uplinks die).

All I’m getting at is that we have two 100G connections doing relatively nothing - I could’ve put in 10G and saved a whole load of money!

Can we/should we do anything to make the 100G DACs useful for traffic? Or is this just how it is with these switches? I’ve considered putting in a couple of 10G connections and moving the IPL over to them so I can then use the 100G DACs for an L2 MLAG, but then there’s all the fun of Spanning Tree loops to consider (something the Nexus figures out for itself and Just Works without any major fiddling).

What’s considered the right thing to do here? We’re running Nutanix hyperconvergence if that makes any difference, but there are other servers attached as well.

Thanks in advance.

Hi David,

Traffic will pass over the IPL in the following two cases (sketched below):

  1. The destination MAC is learnt from a port on the peer MLAG switch
  2. The traffic is BUM (broadcast, unknown unicast, multicast)
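
To put that in concrete terms, here is a tiny Python sketch of that rule - a toy model with made-up MAC addresses, not actual switch code - just to show when a frame would be expected to use the IPL under those two conditions:

```python
# Toy model of the two cases above -- NOT real switch code, just an
# illustration of when a frame is expected to cross the IPL.

BROADCAST = "ff:ff:ff:ff:ff:ff"

def is_bum(dst_mac, known_macs):
    """Broadcast, Unknown-unicast or Multicast ("BUM") traffic."""
    if dst_mac == BROADCAST:
        return True
    if int(dst_mac.split(":")[0], 16) & 0x01:   # group/multicast bit set
        return True
    return dst_mac not in known_macs            # unknown unicast gets flooded

def crosses_ipl(dst_mac, local_macs, peer_macs):
    """True if a frame to dst_mac would be expected to use the IPL."""
    if is_bum(dst_mac, local_macs | peer_macs):
        return True                              # case 2: BUM is flooded over the IPL
    return dst_mac in peer_macs and dst_mac not in local_macs  # case 1

# Example: one MAC learnt on this switch's own ports, one learnt only from the peer
local_macs = {"02:00:00:00:00:01"}
peer_macs = {"02:00:00:00:00:02"}
print(crosses_ipl("02:00:00:00:00:02", local_macs, peer_macs))  # True  (case 1)
print(crosses_ipl("02:00:00:00:00:01", local_macs, peer_macs))  # False (stays local)
print(crosses_ipl("ff:ff:ff:ff:ff:ff", local_macs, peer_macs))  # True  (case 2)
```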

@David Rickard, did you end up adding an L2 transit link between the switches? I’m facing a similar dilemma: while I’ll be point-to-point routed northbound into a pair of Cisco Nexus switches, I need to peer OSPF with them as well.

Normally, I would peer the two OSPF processes between the MLAG’d switches as well (as we would with Nexus vPC, Aruba VSX, et al.) so the advertised topology remains consistent, but the published IPL behavior has me searching for answers. Do I let an MLAG port failure trigger an OSPF topology change (because the VLAN(s) will now transit the IPL, an OSPF DR should be elected, etc.), or do I let STP knock down the loop formed if I have an L2 transit in play (and would the IPL even participate in STP)?

Eddie, sorry for not replying sooner. I thought I’d get a notification but never did. I just googled this again and found my own post!

I had a look and I can see some MACs in the address table learnt via the IPL, so I think I can now see what’s going on. Traffic for single-homed devices definitely does traverse the IPL. For MLAG’d hosts, the same MAC shows up on both switches via the MLAG port-channel, so I’m guessing the switch is smart enough to know when to keep traffic on the local switch and when to send it across the IPL. It certainly seems that doesn’t happen often, as there’s only around 7 Mb/s going over it. Yet there are two HCI clusters on it, so all their traffic must be staying local to one switch (possibly the active one). Either way I don’t think I’m seeing traffic hair-pinning via the core switch, which was really all I wanted to avoid.
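
In case it helps anyone else who lands here, this is the mental model I’ve ended up with, as a rough Python sketch (the MAC table entries are invented for illustration, it’s not output pulled from the switch): a MAC learnt on an MLAG port-channel has a local member link on each peer, so traffic to it never needs the IPL, whereas a MAC on a single-homed port behind the peer has to cross it.

```python
# Rough sketch of why dual-homed (MLAG) hosts keep the IPL quiet while
# single-homed hosts behind the peer don't. The MAC table below is invented.

from dataclasses import dataclass

@dataclass
class MacEntry:
    mac: str
    learnt_on: str      # e.g. an MLAG port-channel or a single-homed (orphan) port
    local_member: bool  # does *this* switch have a live local port for that entry?

def egress(entry: MacEntry) -> str:
    """Pick the egress interface for a known unicast MAC on one MLAG peer."""
    if entry.local_member:
        return entry.learnt_on   # dual-homed: use our own MLAG member link
    return "IPL"                 # single-homed behind the peer: cross the IPL

table = [
    MacEntry("02:00:00:00:00:01", "Mpo10", local_member=True),     # HCI node, dual-homed
    MacEntry("02:00:00:00:00:02", "Eth1/12", local_member=False),  # box plugged into the peer only
]
for entry in table:
    print(entry.mac, "->", egress(entry))
# 02:00:00:00:00:01 -> Mpo10  (stays on this switch, IPL untouched)
# 02:00:00:00:00:02 -> IPL    (has to go over the peer link)
```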