Reduced bandwidth at 10Gbps on Orin and mgbe_payload_cs_err correlation

waterbear · August 31, 2023, 2:19pm

I’m encountering network bandwidth issues with my 64GB Orin dev kit connected to a high-end desktop. My end goal is to be able to transfer data via the 10Gbps link from the dev kit to the desktop via NFS. I’m noticing that I’m somehow inducing conditions that reduce overall throughput and seem to correlate to when I see the “mgbe_payload_cs_err” counter increment, as reported by ethtool -S eth0. What conditions in the Orin’s ethernet controller would cause this to happen?

The code in mgbe_core.c (nv-tegra.nvidia Code Review - kernel/nvethernetrm.git/blob - osi/core/mgbe_core.c) increments this counter when it evaluates ((tx_errors & MGBE_MAC_TX_PCE) == MGBE_MAC_TX_PCE) to be true. Where can I find out more information about the MGBE and its registers to understand better what might be happening? I’ve downloaded the TRM but that is pretty scant with MGBE info.

Below are details about my testing and what I’m seeing.

My test setup is a 64GB Orin dev kit that is directly connected (no intervening switch) via a CAT6A cable to an AMD-based desktop. The desktop is equipped with a Chelsio T540 adapter card which provides 4x10GbE ports. Both the Orin and the desktop adapters are configured with jumbo frames (MTU=9000). I’m running the latest JP 5.1.2 with a slightly tweaked kernel to get a full MTU of 9000 (see RGMII issue with Jetpack 5.1.1 (KSZ9131) - #7 by waterbear).

Further, the Orin is configured for maximum performance. Here’s the output of sudo jetson_clocks --show:

SOC family:tegra234  Machine:Jetson AGX Orin Developer Kit
Online CPUs: 0-11
cpu0:  Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu1:  Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu2:  Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu3:  Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu4:  Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu5:  Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu6:  Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu7:  Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu8:  Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu9:  Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu10: Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu11: Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
GPU MinFreq=1300500000 MaxFreq=1300500000 CurrentFreq=1300500000
EMC MinFreq=204000000 MaxFreq=3199000000 CurrentFreq=3199000000 FreqOverride=1
DLA0_CORE:   Online=1 MinFreq=0 MaxFreq=1600000000 CurrentFreq=1600000000
DLA0_FALCON: Online=1 MinFreq=0 MaxFreq=844800000 CurrentFreq=844800000
DLA1_CORE:   Online=1 MinFreq=0 MaxFreq=1600000000 CurrentFreq=1600000000
DLA1_FALCON: Online=1 MinFreq=0 MaxFreq=844800000 CurrentFreq=844800000
PVA0_VPS0: Online=1 MinFreq=0 MaxFreq=1152000000 CurrentFreq=1152000000
PVA0_AXI:  Online=1 MinFreq=0 MaxFreq=832000000 CurrentFreq=832000000
FAN Dynamic Speed control=active hwmon4_pwm1=54
NV Power Mode: MAXN

The Chelsio adapter on the desktop can either be in standard “NIC” mode or “TCP-Offload Engine” (TOE) mode where TCP operations are offloaded to hardware. In NIC mode, I get the speed I expect although I do see worse performance when running iperf3 in zerocopy mode on the Orin which I can’t explain. I have also observed on occasion retransmits reported by iperf3, and while they are uncommon, do coincide with mgbe_payload_cs_err incrementing.

iperf3 standard mode:

~$ iperf3 --client 10.0.3.1
Connecting to host 10.0.3.1, port 5201
[  5] local 10.0.3.10 port 52314 connected to 10.0.3.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  1.15 GBytes  9.89 Gbits/sec    0   1.66 MBytes
[  5]   1.00-2.00   sec  1.15 GBytes  9.91 Gbits/sec    0   1.66 MBytes
[  5]   2.00-3.00   sec  1.15 GBytes  9.91 Gbits/sec    0   1.66 MBytes
[  5]   3.00-4.00   sec  1.15 GBytes  9.91 Gbits/sec    0   1.66 MBytes
[  5]   4.00-5.00   sec  1.15 GBytes  9.91 Gbits/sec    0   1.66 MBytes
[  5]   5.00-6.00   sec  1.15 GBytes  9.91 Gbits/sec    0   1.66 MBytes
[  5]   6.00-7.00   sec  1.15 GBytes  9.92 Gbits/sec    0   1.66 MBytes
[  5]   7.00-8.00   sec  1.15 GBytes  9.90 Gbits/sec    0   1.66 MBytes
[  5]   8.00-9.00   sec  1.15 GBytes  9.92 Gbits/sec    0   1.66 MBytes
[  5]   9.00-10.00  sec  1.15 GBytes  9.91 Gbits/sec    0   1.66 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  11.5 GBytes  9.91 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  11.5 GBytes  9.91 Gbits/sec                  receiver

iperf Done.

iperf3 zerocopy mode:

~$ iperf3 --client 10.0.3.1 -Z
Connecting to host 10.0.3.1, port 5201
[  5] local 10.0.3.10 port 46158 connected to 10.0.3.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   935 MBytes  7.84 Gbits/sec    0   2.31 MBytes
[  5]   1.00-2.00   sec   938 MBytes  7.86 Gbits/sec    0   2.31 MBytes
[  5]   2.00-3.00   sec   936 MBytes  7.85 Gbits/sec    0   2.31 MBytes
[  5]   3.00-4.00   sec   939 MBytes  7.87 Gbits/sec    0   2.31 MBytes
[  5]   4.00-5.00   sec   940 MBytes  7.89 Gbits/sec    0   2.31 MBytes
[  5]   5.00-6.00   sec   936 MBytes  7.85 Gbits/sec    0   2.31 MBytes
[  5]   6.00-7.00   sec   939 MBytes  7.87 Gbits/sec    0   2.31 MBytes
[  5]   7.00-8.00   sec   934 MBytes  7.83 Gbits/sec    0   2.31 MBytes
[  5]   8.00-9.00   sec   938 MBytes  7.86 Gbits/sec    0   2.54 MBytes
[  5]   9.00-10.00  sec   939 MBytes  7.87 Gbits/sec    0   2.54 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  9.15 GBytes  7.86 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  9.15 GBytes  7.86 Gbits/sec                  receiver

iperf Done.

Any ideas why zerocopy mode performs worse for the Orin here? I’ve got another non-Orin device connected to one of the other Chelsio ports and it operates at full speed regardless of the zerocopy configuration.

When I enable TOE mode on the Chelsio adapter, iperf3 in standard mode continues to operate more or less as expected (retransmits occur but are uncommon). When I enable zero-copy, iperf3 reports a huge number of TCP retransmits. The biggest correlation is that mgbe_payload_cs_err consistently increments when this happens.

iperf3 standard mode w/TOE:

~$ iperf3 --client 10.0.3.1
Connecting to host 10.0.3.1, port 5201
[  5] local 10.0.3.10 port 39076 connected to 10.0.3.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  1.15 GBytes  9.87 Gbits/sec    0   1.04 MBytes
[  5]   1.00-2.00   sec  1.15 GBytes  9.91 Gbits/sec    0   1.04 MBytes
[  5]   2.00-3.00   sec  1.15 GBytes  9.92 Gbits/sec    0   1.04 MBytes
[  5]   3.00-4.00   sec  1.15 GBytes  9.90 Gbits/sec    0   1.04 MBytes
[  5]   4.00-5.00   sec  1.15 GBytes  9.92 Gbits/sec    0   1.04 MBytes
[  5]   5.00-6.00   sec  1.15 GBytes  9.91 Gbits/sec    0   1.04 MBytes
[  5]   6.00-7.00   sec  1.15 GBytes  9.91 Gbits/sec    0   1.04 MBytes
[  5]   7.00-8.00   sec  1.15 GBytes  9.91 Gbits/sec    0   1.04 MBytes
[  5]   8.00-9.00   sec  1.15 GBytes  9.91 Gbits/sec    0   1.04 MBytes
[  5]   9.00-10.00  sec  1.15 GBytes  9.91 Gbits/sec    0   1.04 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  11.5 GBytes  9.91 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  11.5 GBytes  9.90 Gbits/sec                  receiver

iperf Done.

iperf3 zercopy mode w/TOE:

~$ iperf3 --client 10.0.3.1 -Z
Connecting to host 10.0.3.1, port 5201
[  5] local 10.0.3.10 port 47698 connected to 10.0.3.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   540 MBytes  4.53 Gbits/sec  118   8.75 KBytes
[  5]   1.00-2.00   sec   949 MBytes  7.96 Gbits/sec  114   1.04 MBytes
[  5]   2.00-3.00   sec   744 MBytes  6.24 Gbits/sec  116   1.04 MBytes
[  5]   3.00-4.00   sec   515 MBytes  4.32 Gbits/sec  232   1.03 MBytes
[  5]   4.00-5.00   sec   729 MBytes  6.11 Gbits/sec  117   1.03 MBytes
[  5]   5.00-6.00   sec   525 MBytes  4.40 Gbits/sec  231    788 KBytes
[  5]   6.00-7.00   sec   740 MBytes  6.21 Gbits/sec  116   1.02 MBytes
[  5]   7.00-8.00   sec   651 MBytes  5.46 Gbits/sec  117   1.04 MBytes
[  5]   8.00-9.00   sec   376 MBytes  3.16 Gbits/sec  347   1.06 MBytes
[  5]   9.00-10.00  sec   819 MBytes  6.87 Gbits/sec    1   1.06 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  6.43 GBytes  5.53 Gbits/sec  1509             sender
[  5]   0.00-10.00  sec  6.43 GBytes  5.52 Gbits/sec                  receiver

iperf Done.

In the first test above, viewing mgbe_payload_cs_err via ethtool -S eth0 showed that it did not increment. In the latter case, it incremented by 14 counts. Those 14 counts seem to correlate with the huge amount of retransmitted packets reported.

I’d be inclined to think the issue is with the Chelsio adapter and its TOE mode, but the other non-Orin test device I have does not exhibit this behavior at all. It suffers little to no retransmits in either NIC or TOE mode and does not exhibit any reduced performance when zerocopy is enabled.

The long and short of all of this is that when I have NFS enabled and the Chelsio card configured for TOE mode, the throughput from the Orin to the desktop gets greatly reduced and is not consistent - I see transfer rates ranging from 500MB/s to 800MB/s for a 60s test run. The other test device in the system does not exhibit any of these issues – it reports a transfer rate of ~1000MB/s regardless of the Chelsio mode. This leads me to believe the issues is with the Orin’s ethernet controller.

Any ideas? Circling back to my original question above – is there additional documentation available detailing the MGBE operation?

waterbear · August 31, 2023, 3:59pm

FWIW, I inserted a X540-10G-1T-X8 10GbE adapter into the Orin dev kit’s PCIe slot and ran the iperf3 tests against the TOE mode of the Chelsio adapter. Results:

iperf3 in standard mode w/TOE:

~$ iperf3 --client 10.0.3.1
Connecting to host 10.0.3.1, port 5201
[  5] local 10.0.3.11 port 39974 connected to 10.0.3.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  1.16 GBytes  9.94 Gbits/sec    0   1015 KBytes       
[  5]   1.00-2.00   sec  1.15 GBytes  9.91 Gbits/sec    0   1015 KBytes       
[  5]   2.00-3.00   sec  1.15 GBytes  9.91 Gbits/sec    0   1015 KBytes       
[  5]   3.00-4.00   sec  1.15 GBytes  9.91 Gbits/sec    0   1015 KBytes       
[  5]   4.00-5.00   sec  1.15 GBytes  9.91 Gbits/sec    0   1015 KBytes       
[  5]   5.00-6.00   sec  1.15 GBytes  9.91 Gbits/sec    0   1015 KBytes       
[  5]   6.00-7.00   sec  1.15 GBytes  9.91 Gbits/sec    0   1015 KBytes       
[  5]   7.00-8.00   sec  1.15 GBytes  9.91 Gbits/sec    0   1015 KBytes       
[  5]   8.00-9.00   sec  1.16 GBytes  9.92 Gbits/sec    0   1015 KBytes       
[  5]   9.00-10.00  sec  1.15 GBytes  9.91 Gbits/sec    0   1015 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  11.5 GBytes  9.92 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  11.5 GBytes  9.91 Gbits/sec                  receiver

iperf Done.

iperf3 in zerocopy mode w/TOE:

~$ iperf3 --client 10.0.3.1 -Z
Connecting to host 10.0.3.1, port 5201
[  5] local 10.0.3.11 port 60404 connected to 10.0.3.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  1.00 GBytes  8.61 Gbits/sec    0   1.04 MBytes       
[  5]   1.00-2.00   sec  1022 MBytes  8.58 Gbits/sec    0   1.04 MBytes       
[  5]   2.00-3.00   sec  1022 MBytes  8.58 Gbits/sec    0   1.04 MBytes       
[  5]   3.00-4.00   sec  1.00 GBytes  8.60 Gbits/sec    0   1.04 MBytes       
[  5]   4.00-5.00   sec  1.00 GBytes  8.60 Gbits/sec    0   1.04 MBytes       
[  5]   5.00-6.00   sec  1024 MBytes  8.59 Gbits/sec    0   1.04 MBytes       
[  5]   6.00-7.00   sec  1.00 GBytes  8.61 Gbits/sec    0   1.04 MBytes       
[  5]   7.00-8.00   sec  1022 MBytes  8.58 Gbits/sec    0   1.04 MBytes       
[  5]   8.00-9.00   sec  1.00 GBytes  8.60 Gbits/sec    0   1.04 MBytes       
[  5]   9.00-10.00  sec  1024 MBytes  8.59 Gbits/sec    0   1.04 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  10.0 GBytes  8.59 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  10.0 GBytes  8.59 Gbits/sec                  receiver

iperf Done.

No retransmits encountered. All this seems to point to an issue with the MGBE hardware and/or drivers.

waterbear · September 5, 2023, 4:27pm

I’ve noticed that I can get the problem to go away if I disable tx-checksum-ipv4. The side effect of doing this is decreased throughput. Here are the steps that I follow:

Reboot the Orin dev kit
Reboot the desktop
Leave the desktop’s Chelsio adapter in the default “NIC” mode
Configure the Chelsio adapter with an MTU 9000 on the port connected to the Orin dev kit
Launch iperf3 --server -4 --bind 10.0.3.1 on the server (IP address is specific to the adapter port connected to the Orin dev kit)
Execute sudo jetson_clocks on the Orin
Configure eth0 on the Orin with an MTU of 9000
Check mgbe_payload_cs_err before and after running iperf3, noting that they match the # of retransmits detected:

orin@orin64-alpha:~/projects/romulan/scripts$ ethtool -S eth0 | grep payload
     payload_cs_error: 0
     mgbe_payload_cs_err: 0
orin@orin64-alpha:~/projects/romulan/scripts$ iperf3 --client 10.0.3.1
Connecting to host 10.0.3.1, port 5201
[  5] local 10.0.3.10 port 36156 connected to 10.0.3.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  1.15 GBytes  9.84 Gbits/sec    0   2.02 MBytes
[  5]   1.00-2.00   sec  1.15 GBytes  9.90 Gbits/sec    2   1.79 MBytes
[  5]   2.00-3.00   sec  1.15 GBytes  9.90 Gbits/sec    2   1.82 MBytes
[  5]   3.00-4.00   sec  1.15 GBytes  9.90 Gbits/sec    5   1.57 MBytes
[  5]   4.00-5.00   sec  1.15 GBytes  9.89 Gbits/sec    4   1.82 MBytes
[  5]   5.00-6.00   sec  1.15 GBytes  9.90 Gbits/sec    3   1.36 MBytes
[  5]   6.00-7.00   sec  1.15 GBytes  9.90 Gbits/sec    3   1.65 MBytes
[  5]   7.00-8.00   sec  1.15 GBytes  9.90 Gbits/sec    5   1.70 MBytes
[  5]   8.00-9.00   sec  1.15 GBytes  9.90 Gbits/sec    2   1.72 MBytes
[  5]   9.00-10.00  sec  1.15 GBytes  9.90 Gbits/sec    1   1.46 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  11.5 GBytes  9.89 Gbits/sec   27             sender
[  5]   0.00-10.00  sec  11.5 GBytes  9.89 Gbits/sec                  receiver

iperf Done.
orin@orin64-alpha:~/projects/romulan/scripts$ ethtool -S eth0 | grep payload
     payload_cs_error: 0
     mgbe_payload_cs_err: 27

Disable tx-checksum-ipv4 via sudo ethtool -K eth0 tx-checksum-ipv4 off
Rerun the iperf3 client test and observe no retransmits / mgbe_payload_cs_err occur, but things then run at a much lower bitrate:

orin@orin64-alpha:~/projects/romulan/scripts$ sudo ethtool -K eth0 tx-checksum-ipv4 off
Actual changes:
tx-checksum-ipv4: off
tcp-segmentation-offload: off
        tx-tcp-segmentation: off [requested on]
orin@orin64-alpha:~/projects/romulan/scripts$ ethtool -S eth0 | grep payload
     payload_cs_error: 0
     mgbe_payload_cs_err: 27
orin@orin64-alpha:~/projects/romulan/scripts$ iperf3 --client 10.0.3.1
Connecting to host 10.0.3.1, port 5201
[  5] local 10.0.3.10 port 60198 connected to 10.0.3.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   727 MBytes  6.10 Gbits/sec    0   1.08 MBytes
[  5]   1.00-2.00   sec   729 MBytes  6.11 Gbits/sec    0   1.08 MBytes
[  5]   2.00-3.00   sec   729 MBytes  6.11 Gbits/sec    0   1.13 MBytes
[  5]   3.00-4.00   sec   730 MBytes  6.12 Gbits/sec    0   1.13 MBytes
[  5]   4.00-5.00   sec   729 MBytes  6.11 Gbits/sec    0   1.13 MBytes
[  5]   5.00-6.00   sec   730 MBytes  6.12 Gbits/sec    0   1.13 MBytes
[  5]   6.00-7.00   sec   729 MBytes  6.11 Gbits/sec    0   1.13 MBytes
[  5]   7.00-8.00   sec   730 MBytes  6.12 Gbits/sec    0   1.13 MBytes
[  5]   8.00-9.00   sec   730 MBytes  6.12 Gbits/sec    0   1.84 MBytes
[  5]   9.00-10.00  sec   730 MBytes  6.12 Gbits/sec    0   1.84 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  7.12 GBytes  6.12 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  7.12 GBytes  6.11 Gbits/sec                  receiver

iperf Done.
orin@orin64-alpha:~/projects/romulan/scripts$ ethtool -S eth0 | grep payload
     payload_cs_error: 0
     mgbe_payload_cs_err: 27

waterbear · September 6, 2023, 2:18pm

It seems that the patch I created to allow a packet size of 9000 (see RGMII issue with Jetpack 5.1.1 (KSZ9131) - #7 by waterbear) is inducing the issue. I decided to try different MTU sizes to see what effects those had (e.g. 1500 → 3000 → 6000 → 9000). At 9000 the troubles appeared. I backed things off to 8966 and don’t see retransmit / mgbe_payload_cs_errs. Here’s an example run:

orin@orin64-alpha:~/projects/romulan/scripts$ sudo ./update-mtu.sh eth0 9000
orin@orin64-alpha:~/projects/romulan/scripts$ ethtool -S eth0 | grep payload
     payload_cs_error: 0
     mgbe_payload_cs_err: 610
orin@orin64-alpha:~/projects/romulan/scripts$ iperf3 -Z --client 10.0.3.1 -t 5
Connecting to host 10.0.3.1, port 5201
[  5] local 10.0.3.10 port 39528 connected to 10.0.3.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   509 MBytes  4.27 Gbits/sec  232   1.02 MBytes
[  5]   1.00-2.00   sec   646 MBytes  5.42 Gbits/sec  117   1.04 MBytes
[  5]   2.00-3.00   sec   584 MBytes  4.90 Gbits/sec  231   1.05 MBytes
[  5]   3.00-4.00   sec   952 MBytes  7.99 Gbits/sec    0   1.05 MBytes
[  5]   4.00-5.00   sec   952 MBytes  7.99 Gbits/sec    0   1.05 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-5.00   sec  3.56 GBytes  6.11 Gbits/sec  580             sender
[  5]   0.00-5.00   sec  3.56 GBytes  6.11 Gbits/sec                  receiver

iperf Done.
orin@orin64-alpha:~/projects/romulan/scripts$ ethtool -S eth0 | grep payload
     payload_cs_error: 0
     mgbe_payload_cs_err: 615
orin@orin64-alpha:~/projects/romulan/scripts$ sudo ./update-mtu.sh eth0 8966
orin@orin64-alpha:~/projects/romulan/scripts$ ethtool -S eth0 | grep payload
     payload_cs_error: 0
     mgbe_payload_cs_err: 615
orin@orin64-alpha:~/projects/romulan/scripts$ iperf3 -Z --client 10.0.3.1 -t 5
Connecting to host 10.0.3.1, port 5201
[  5] local 10.0.3.10 port 56760 connected to 10.0.3.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   935 MBytes  7.84 Gbits/sec    0   1.03 MBytes
[  5]   1.00-2.00   sec   936 MBytes  7.85 Gbits/sec    0   1.03 MBytes
[  5]   2.00-3.00   sec   936 MBytes  7.85 Gbits/sec    0   1.03 MBytes
[  5]   3.00-4.00   sec   936 MBytes  7.85 Gbits/sec    0   1.03 MBytes
[  5]   4.00-5.00   sec   938 MBytes  7.86 Gbits/sec    0   1.03 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-5.00   sec  4.57 GBytes  7.85 Gbits/sec    0             sender
[  5]   0.00-5.00   sec  4.57 GBytes  7.85 Gbits/sec                  receiver

iperf Done.
orin@orin64-alpha:~/projects/romulan/scripts$ ethtool -S eth0 | grep payload
     payload_cs_error: 0
     mgbe_payload_cs_err: 615

I can’t update the other thread since that’s locked, so hopefully anyone stumbling on that will see this one as well.

I’m still not sure what’s up and assume something funky is happening in the MGBE controller when a packet size of 9000 is defined.

DaneLLL · September 13, 2023, 3:50am

Hi,
Please try this setup:
The maximum speed of the 10G network port of the ORIN-32GB module can only reach 6.5G using iperf3 - #4 by DaneLLL

The rootfs wirh Ubuntu desktop may have impact to throughput. Please try minimum rootfs.

waterbear · September 13, 2023, 11:42am

I don’t see how this would address what I’m running into. I think zero-copy is a different issue altogether. I can achieve 9.9G not using zero-copy.

A problem gets introduced if one “fixes” the mac-sec issue in the kernel and can actually set the MTU to 9000. The MGBE starts encountering mgbe_payload_cs_err related errors. It’s best at this point for the state of the kernel to leave the MTU 8966 limit in place and simply maximize the clocks via sudo jetson_clocks.

system · October 9, 2023, 8:49am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Package loss on bandwidth reduced network - AGX Orin Jetson AGX Orin ethernet , networking	20	1915	April 11, 2023
Jetson Orin NX, custom board, add M.2 on pcie2, pcieport error Jetson Orin NX pcie , board-design	2	299	April 23, 2024
How to improve Orin module memory bandwidth Jetson AGX Orin performance	21	577	May 23, 2024
ORIN 32GB USB3 hard drives disconnect Jetson AGX Orin ubuntu , board-design	13	142	July 17, 2024
Total USB 3.2 Bandwidth on Orin AGX DevKit Jetson AGX Orin usb , board-design	9	155	September 11, 2024
Question for 10G (MGBE) Not working Jetson AGX Orin board-design , ethernet	60	1080	August 13, 2024
High CPU usage on Jetson TX2 with GigE fully loaded Jetson TX2 hw , kernel , performance	12	2529	October 18, 2021
Orin boot error Jetson AGX Orin boot	16	2711	May 3, 2023
Nvethernet PTP bug Jetson AGX Orin kernel , time-synchronization	28	2925	January 16, 2024
Jetson orin and mcp2518fd can module Jetson AGX Orin spi , can-bus	27	2659	August 22, 2023

Reduced bandwidth at 10Gbps on Orin and mgbe_payload_cs_err correlation

Related topics