We are evaluating SONiC on an SN2410 switch and are currently experiencing what seems to be strange behaviour regarding MTU.
The version we are running is:
Version
SONiC Software Version: SONiC.master.104880-bd91b2eef
Distribution: Debian 11.3
Kernel: 5.10.0-12-2-amd64
Build commit: bd91b2eef
Build date: Tue May 31 18:47:36 UTC 2022
Built by: AzDevOps@sonic-build-workers-001KB2
Platform: x86_64-mlnx_msn2410-r0
HwSKU: ACS-MSN2410
ASIC: mellanox
ASIC Count: 1
Serial Number: MT2135J13918
Model Number: MSN2410-CB2FC
Hardware Revision: A2
We have a working LAG with 2x100G interfaces:
Portchannel
Flags: A - active, I - inactive, Up - up, Dw - Down, N/A - not available,
S - selected, D - deselected, * - not synced
No. Team Dev Protocol Ports
----- --------------- ----------- -----------------
0001 PortChannel0001 LACP(A)(Up) etp49(S) etp50(S)
By default MTU is set to 9100:
Default MTU
Interface Lanes Speed MTU FEC Alias Vlan Oper Admin Type Asym PFC
--------------- ------- ------- ----- ----- ------- ------ ------ ------- ------ ----------
PortChannel0001 N/A 200G 9100 N/A N/A trunk up up N/A N/A
Interface Lanes Speed MTU FEC Alias Vlan Oper Admin Type Asym PFC
----------- --------------- ------- ----- ----- ------- --------------- ------ ------- --------------- ----------
Ethernet192 192,193,194,195 100G 9100 N/A etp49 PortChannel0001 up up QSFP28 or later N/A
Interface Lanes Speed MTU FEC Alias Vlan Oper Admin Type Asym PFC
----------- --------------- ------- ----- ----- ------- --------------- ------ ------- --------------- ----------
Ethernet196 196,197,198,199 100G 9100 N/A etp50 PortChannel0001 up up QSFP28 or later N/A
We change the MTU to 9216 as this is our desired configuration:
MTU Changed to 9216
Interface Lanes Speed MTU FEC Alias Vlan Oper Admin Type Asym PFC
----------- --------------- ------- ----- ----- ------- --------------- ------ ------- --------------- ----------
Ethernet192 192,193,194,195 100G 9216 N/A etp49 PortChannel0001 up up QSFP28 or later N/A
Interface Lanes Speed MTU FEC Alias Vlan Oper Admin Type Asym PFC
----------- --------------- ------- ----- ----- ------- --------------- ------ ------- --------------- ----------
Ethernet196 196,197,198,199 100G 9216 N/A etp50 PortChannel0001 up up QSFP28 or later N/A
Interface Lanes Speed MTU FEC Alias Vlan Oper Admin Type Asym PFC
--------------- ------- ------- ----- ----- ------- ------ ------ ------- ------ ----------
PortChannel0001 N/A 200G 9216 N/A N/A trunk up up N/A N/A
The LAG is a trunk carrying just a few VLANs. One of these VLANs has an L3 interface:
| VLAN ID | IP Address | Ports | Port Tagging | Proxy ARP | DHCP Helper Address |
+===========+===========================+=================+================+=============+=======================+
| 10 | | PortChannel0001 | tagged | disabled | |
+-----------+---------------------------+-----------------+----------------+-------------+-----------------------+
| 100 | | PortChannel0001 | tagged | disabled | |
+-----------+---------------------------+-----------------+----------------+-------------+-----------------------+
| 637 | | PortChannel0001 | tagged | disabled | |
+-----------+---------------------------+-----------------+----------------+-------------+-----------------------+
| 815 | | PortChannel0001 | tagged | disabled | |
+-----------+---------------------------+-----------------+----------------+-------------+-----------------------+
| 900 | <ipv6>/64 | PortChannel0001 | tagged | disabled | |
+-----------+---------------------------+-----------------+----------------+-------------+-----------------------+
We have a directly connected L3 device with a VLAN interface on the same IPv6 network:
- When we run a basic ICMPv6 test sourced from that L3 device, packets of up to 9100 bytes (the default MTU) pass without fragmentation;
- But with just 1 byte more, i.e., 9101 bytes, the echo reply from the SN2410 is fragmented, as shown by the tcpdump captures collected on the SN2410 device:
Tcpdump with 9100 bytes ICMPv6 packets
17:00:50.342479 IP6 <L3-device-ip> > <SN2410-ip>: ICMP6, echo request, id 15788, seq 0, length 9060
17:00:50.342668 IP6 <SN2410-ip> > <L3-device-ip>: ICMP6, echo reply, id 15788, seq 0, length 9060
Tcpdump with 9101 bytes ICMPv6 packets
16:59:00.440470 IP6 <L3-device-ip> > <SN2410-ip>: ICMP6, echo request, id 15594, seq 0, length 9061
16:59:00.440634 IP6 <SN2410-ip> > <L3-device-ip>: frag (0|9048) ICMP6, echo reply, id 15594, seq 0, length 9048
16:59:00.440673 IP6 <SN2410-ip> > <L3-device-ip>: frag (9048|13)
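To make the sizes in the captures above easier to follow, here is a minimal sketch of the arithmetic. It assumes a plain IPv6 packet with no extension headers other than the fragment header (40-byte IPv6 header, 8-byte fragment header, fragment payloads aligned to 8 bytes). The function names are ours, for illustration only, and this is not how SONiC or the kernel implements fragmentation.

```python
IPV6_HEADER = 40   # fixed IPv6 header size in bytes
FRAG_HEADER = 8    # IPv6 fragment extension header size in bytes


def icmpv6_length(ip_packet_size):
    """ICMPv6 length as tcpdump prints it, for a given IPv6 packet size."""
    return ip_packet_size - IPV6_HEADER


def fragment(icmp_len, mtu):
    """Return (offset, length) pairs for an ICMPv6 message sent with this MTU.

    Assumption: no other extension headers, so the whole ICMPv6 message
    is the fragmentable part.
    """
    if icmp_len + IPV6_HEADER <= mtu:
        return [(0, icmp_len)]  # fits in one packet, no fragmentation
    room = mtu - IPV6_HEADER - FRAG_HEADER
    chunk = room - room % 8     # non-final fragments are multiples of 8 bytes
    frags = []
    offset = 0
    while offset < icmp_len:
        size = min(chunk, icmp_len - offset)
        frags.append((offset, size))
        offset += size
    return frags


if __name__ == "__main__":
    for size in (9100, 9101):
        length = icmpv6_length(size)
        print(size, length, fragment(length, 9100), fragment(length, 9216))
```

Under these assumptions, an egress MTU of 9100 reproduces the `frag (0|9048)` and `frag (9048|13)` pair seen for the 9101-byte test, while an MTU of 9216 would send the reply unfragmented.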
The L3 device has its MTU set accordingly, and we do not see fragmentation when the same L3 device is connected to another switch that does not run SONiC and has its MTU configured in the same way.
So it seems that, although the MTU is set to 9216, the default value of 9100 is still being applied.
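One check we could still add is what the Linux kernel on the switch reports for each interface involved, since the outputs above only cover the physical ports and the port channel. The sketch below reads the standard Linux sysfs attribute `/sys/class/net/<name>/mtu`. The interface names `Vlan900` and `Bridge` are assumptions based on common SONiC naming and may not match this system; the helper name `read_kernel_mtus` is ours.

```python
from pathlib import Path


def read_kernel_mtus(names, sysfs_root="/sys/class/net"):
    """Return {interface name: MTU or None} as reported by the kernel.

    None means the interface does not exist under sysfs_root.
    """
    result = {}
    for name in names:
        mtu_file = Path(sysfs_root) / name / "mtu"
        try:
            result[name] = int(mtu_file.read_text().strip())
        except (FileNotFoundError, NotADirectoryError):
            result[name] = None
    return result


if __name__ == "__main__":
    # Vlan900 and Bridge are assumed names; adjust to the actual system.
    candidates = ["Ethernet192", "Ethernet196", "PortChannel0001",
                  "Vlan900", "Bridge"]
    for name, mtu in read_kernel_mtus(candidates).items():
        print(f"{name}: {mtu if mtu is not None else 'not found'}")
```

If any interface on the L3 path still reports 9100 here, that would be consistent with the fragmentation we observe, though we have not confirmed this.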
Is the SN2410 limited to 9100 and perhaps SONiC isn’t rejecting a higher value when we set one?
Any other suggestions that may explain this behaviour?
Thank you in advance.