MLAG - proper active/active iSCSI cabling

Hello community,

I have a fairly simple situation, actually, but surprisingly many options.

I have the following deployment:

  • 2 SN2700 switches running the latest ONYX, which I want to configure as an MLAG pair,
  • 1 iSCSI SAN array, which should be configured in a so-called “symmetric active/active” mode: both controllers are active, and connections to the LUNs are dynamically allocated to controllers and ports by the array itself. The connected switch ports are expected to run LACP (bonding mode 4),
  • 3 downstream Linux virtualisation hosts (CentOS 8.1), which access the LUNs via “DM-Multipath” and connect to MLAG interfaces on the switches (host-side bond sketched below).
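
For context, the host side looks roughly like the following sketch (a minimal NetworkManager setup for an 802.3ad bond; the interface names ens1f0/ens1f1 and the hash policy are placeholders, not our actual values):

    # Create an 802.3ad (LACP, bonding "mode 4") bond; names are placeholders.
    nmcli connection add type bond con-name bond0 ifname bond0 \
        bond.options "mode=802.3ad,miimon=100,xmit_hash_policy=layer3+4"
    # Enslave the two physical NICs (hypothetical names) to the bond.
    nmcli connection add type ethernet con-name bond0-port1 ifname ens1f0 master bond0
    nmcli connection add type ethernet con-name bond0-port2 ifname ens1f1 master bond0
    nmcli connection up bond0
    # Verify that LACP actually negotiated with the switch side.
    cat /proc/net/bonding/bond0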

Each array controller features two “channels” (ethernet interfaces), whose numbering apparently corresponds across the two controllers:

  • Controller A: channel3, channel4
  • Controller B: channel3, channel4

The manufacturer of the array seems to recommend linking both channels of each controller to the same switch, but without taking into account how these switches are actually configured, in other words: whether and how they “cooperate”.

The Mellanox MLAG documentation for general downstream devices, however, recommends connecting the two links of each “downstream host” LAG to different switches, which I find understandable.

My understanding is that the following connection schema would therefore conform with the Mellanox recommendation for connecting downstream devices to MLAG interfaces:

CTLA:ch3 => SW1:p3
CTLA:ch4 => SW2:p3
CTLB:ch3 => SW1:p4
CTLB:ch4 => SW2:p4
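
If I read the ONYX MLAG syntax correctly, this schema would make each controller one MLAG port-channel spanning both switches, roughly like this sketch (identical on SW1 and SW2, assuming the IPL and mlag-vip are already configured; VLAN/switchport settings omitted):

    ## MPO 1 = controller A (port 3 on both switches), MPO 2 = controller B (port 4)
    interface mlag-port-channel 1
    interface mlag-port-channel 2
    interface ethernet 1/3 mlag-channel-group 1 mode active
    interface ethernet 1/4 mlag-channel-group 2 mode active
    interface mlag-port-channel 1 no shutdown
    interface mlag-port-channel 2 no shutdown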

While the manufacturer of the array seems to recommend:

CTLA:ch3 => SW1:p3
CTLA:ch4 => SW1:p4
CTLB:ch3 => SW2:p3
CTLB:ch4 => SW2:p4

albeit explicitly ignoring any switch configuration (MLAG, stack, …).
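
As far as I can tell, that schema would amount to a plain per-switch LACP port-channel rather than an MLAG one, roughly like this (sketch for SW1; SW2 analogous for controller B):

    ## Controller A's two channels in a local LAG on SW1 only
    interface port-channel 1
    interface ethernet 1/3 channel-group 1 mode active
    interface ethernet 1/4 channel-group 1 mode active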

If all that can be known about the inner workings of the array's “symmetric active/active” mode is

  • that it attempts to spread the load evenly over all “ports”/“channels”,
  • that ports apparently correspond across the controllers,
  • and that it expects the related switch ports to be configured for LACP (bonding mode 4, “802.3ad”),

which do you think is the better/correct approach to wiring up this rig?

I’d highly appreciate any of your thoughts on this,

best,

Hilmar

Hello Hilmar,

Depending on your design goals, either proposed solution could work. To best assist you with your specific use case, you can explore upgrading your support contract level or procuring a design review from our professional services team by emailing networking-support@nvidia.com.

Below are some general considerations, based on the information you've provided, which may help you make a more informed decision.

With the manufacturer's recommendation of connecting each storage controller to a single switch, you gain a predictable traffic pattern. If you instead connect each storage controller across both clustered MLAG switches, you gain more efficient link utilization through load balancing and increased availability through physical link redundancy.

Load balancing across the port-channel members can be manipulated in global configuration mode; see below for a syntax example.

sn2100-02 [mlag-vip: master] (config) # port-channel load-balance ethernet ?
    destination-ip             Destination IP address
    destination-mac            Destination MAC address
    destination-port           Destination UDP/TCP port
    flow-label                 IPv6 flow-label field
    ingress-port               Ingress port
    l2-protocol                Ethertype field
    l3-protocol                IP protocol field
    source-destination-ip      Source and destination IP addresses
    source-destination-mac     Source and destination MAC addresses
    source-destination-port    Source and destination UDP/TCP ports
    source-ip                  Source IP address
    source-mac                 Source MAC address
    source-port                Source UDP/TCP port
    symmetric                  Symmetric hashing; bidirectional flows follow same path
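
As an illustration (not a recommendation for your specific array), hashing on IP addresses and L4 ports usually spreads iSCSI sessions across the LAG members better than a MAC-based hash; whichever fields you choose, apply the same setting on both MLAG peers so flows hash consistently:

    sn2100-02 [mlag-vip: master] (config) # port-channel load-balance ethernet source-destination-ip source-destination-port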

A validated MLAG configuration for Mellanox switches can be found at this link, and if you wish to combine this setup with an L3 first-hop redundancy protocol, it can be coupled with this MAGP configuration guide.
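
For orientation, a minimal MAGP sketch, to be applied identically on both switches; the VLAN ID, virtual IP, and virtual MAC below are placeholders:

    switch (config) # protocol magp
    switch (config) # interface vlan 10
    switch (config interface vlan 10) # magp 1
    switch (config interface vlan 10 magp 1) # ip virtual-router address 10.10.10.254
    switch (config interface vlan 10 magp 1) # ip virtual-router mac-address 00:00:5E:00:01:01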

I hope this information has been helpful.

Best,

Brian T.

Global Technical Support

NVIDIA Networking

Hello Brian,

thank you very much for the fast help and for looking into my question!

The MLAG guide you recommended is really good and ultimately led me to the conclusion that the array manufacturer's general recommendation is possibly not the best solution in our specific environment.

It definitely helps to hear from you that both options are fundamentally valid nevertheless, just fitting different scenarios.

Fortunately the upstream L3 situation is nothing I have to deal with, which helps keep the complexity at bay ;-)

Since I have a mixed use case (virtualisation, HPC), I'll have to test and measure anyway, and I hope that will help me decide on the best option.

Thank you very much again so far,

best

Hilmar