Unable to get CLAG working between Cumulus and Juniper

Dear community,

I have a couple of Cumulus switches [Edgecore AS4610 1GbE] that I’m trying to connect to a pair of Juniper core switches using CLAG.

My Cumulus switches are stacked together with a peerlink and all that good stuff. As a test, I created CLAGs between the Cumulus switches to downlink Linux servers, and that works just fine. However, when I create uplink CLAGs to connect to the Juniper pair, LACP negotiation seems to finish fine, but soon after I believe the Junipers shut the ports. I was able to see one ping go through when connecting things. On Cumulus, I can see the Junipers show up as neighbors (their hostnames appear when I list the bond config), but the CLAG just won’t stay up.

I’m not a Juniper expert, but the networking team that handles them told me they’re using MLAG (which is essentially LACP) and have LACP set to active mode on their switches.

Does anyone have any experience connecting Cumulus switches to Juniper switches? Maybe there are some parameters that need to be tuned on both sides of the connection? I’d appreciate any insights on what to look for, or any previous experiences you can share.

My configs are attached:
1gsw0101.txt (1.7 KB)
1gsw0102.txt (1.7 KB)

Thank you in advance!

I don’t have any experience with Juniper devices, but the behavior you’re describing sounds an awful lot like Spanning Tree loop avoidance.
In that scenario, the port might be up for a few seconds; the Cumulus CLAG primary then sends a BPDU to the Juniper device, and if the Juniper port is set to a PortFast-like behavior, it instantly disables the port.
I would check the STP settings and BPDU filtering/blocking/portfast configurations.
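
For reference, here is a minimal sketch of what that check could look like on the Cumulus side. It assumes the bond is named uplink (as in the attached configs) and that the bridge uses the default NCLU name bridge; adjust both if your setup differs. These commands only read state and change nothing.

```shell
# Bond and LACP partner state as seen by the Linux bonding driver
cat /proc/net/bonding/uplink

# MLAG status: peer state and per-bond CLAG IDs
net show clag

# Spanning tree summary for the bridge
net show bridge spanning-tree

# Per-port STP detail; look at the bpdu guard / bpdu filter fields
sudo mstpctl showportdetail bridge uplink

# mstpd logs to syslog, so BPDU-related events should show up here
sudo grep mstpd /var/log/syslog
```

A port held down by BPDU guard on either end would be consistent with the "one ping, then down" symptom, but the Juniper side would need the equivalent checks from its own admins.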

Hello @epulvino, thank you so much for your insight on the topic.

I’m not an expert in switches either, but I thought that creating a CLAG configuration on Cumulus and having MLAG (LACP) on Juniper (their equivalent of CLAG) was meant exactly to avoid a loop.

From what I read in the Cumulus documentation, the AS4610 only supports lacp-rate slow. I don’t think this maps to portfast, though; it’s just the LACP rate.

The config I have on Cumulus for stp is as follows (also in the configs attached):

net add bond uplink stp portadminedge ← equivalent to port-fast
net add bond uplink stp bpduguard ← protect against loops

I was reading about it, and the documentation says not to have bpduguard enabled on ports that interconnect switches. On the other hand, there’s documentation that says it should be enabled to avoid loops (Spanning Tree and Rapid Spanning Tree - STP | Cumulus Linux 4.3).

You mentioned checking the STP and BPDU settings; please let me know what else comes to mind. Also, if you’re familiar with gathering logs or debug info from the Cumulus side regarding STP/BPDU, do you have any ideas about what to look for?

Thank you so much!

So, we got this working. The problem was indeed related to spanning tree.

The Juniper switches had the LACP bundle configured as an MC-LAG, and that MC-LAG was not running any sort of spanning tree at all. The Juniper admins said that MC-LAG is intelligent enough to detect loops, and that we didn’t need to use STP here.

So, I removed the STP configuration (portadminedge and bpduguard) from my uplinks, added bpdufilter to them, and things worked just fine. In the end, my uplink config was as follows, with no STP settings other than the BPDU filter:

interface uplink
  bond-slaves swp47 swp48
  bridge-pvid 101
  clag-id 47
  mstpctl-portbpdufilter yes
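
In NCLU terms, the change described above might look roughly like the following. This is a sketch, not the exact commands that were run: it assumes the bond is still named uplink, and the net del lines simply mirror the two net add lines quoted earlier in the thread.

```shell
# Remove the edge-port and BPDU guard settings from the uplink bond
net del bond uplink stp portadminedge
net del bond uplink stp bpduguard

# Stop sending and processing BPDUs on the uplink
net add bond uplink stp portbpdufilter

# Review the staged changes, then apply them
net pending
net commit
```

Keep in mind that a BPDU filter removes spanning tree protection on that link, so this relies on the Juniper MC-LAG side to keep the topology loop-free, as their admins indicated.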

Thank you!
