Hardware: two Thor devkits + T5000, 10G x4, directly connected via QSFP28
Software: L4T-R38.4.0
Symptom: when driving traffic with iperf3, throughput repeatedly drops to 0 bps; during the sustained 0 bps stretches, ping replies arrive as backlogged bursts. See the boxed area in the screenshot.
Question: how can this be resolved?
Try parallelism:
iperf3 -c <SERVER_IP> -P 16 -t 30 -O 5
Bidirectional test:
iperf3 -c <SERVER_IP> -P 8 -t 30 -O 5 --bidir
-P, --parallel: number of parallel client streams to run
-t 30: run the test for 30 seconds
-O 5: omit the first 5 seconds (warmup)
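On the server side, per-second interval reporting makes the 0 bps dips easy to spot:
iperf3 -s -i 1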
Another thing that might matter: what is your MTU, and is it identical on all ethernet devices?
ip link show mgbe0_0
ip link show mgbe1_0
ip link show mgbe2_0
ip link show mgbe3_0
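Or list all four MTUs at once (a convenience one-liner, assuming the same interface names):
for i in {0..3}; do ip -o link show mgbe${i}_0 | awk '{print $2, $5}'; done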
If you are using bonding.ko:
cat /proc/net/bonding/bond0
ip link show bond0
If your actual MTU does not match your desired MTU setting: do you need MACsec? MACsec reduces the MTU by 34 bytes.
nvidia-oot/drivers/net/ethernet/nvidia/nvethernet/macsec.h
/**
 * @brief MACSEC SECTAG + ICV + 2B ethertype adds up to 34B
 */
#define MACSEC_TAG_ICV_LEN 34U
If you don’t need macsec you could try:
sudo tee /etc/modprobe.d/nvethernet.conf <<'EOF'
options nvethernet macsec_enable=0
EOF
Reboot, then run the ip link show block again to see whether the actual MTU now matches the desired MTU.
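If the module exposes the parameter in sysfs (I have not verified this on the nvethernet driver), you can also confirm the setting took effect:
cat /sys/module/nvethernet/parameters/macsec_enable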
I don't know the root cause, but these might help.
Disable Hardware Offloads:
sudo ethtool -K mgbe0_0 tso off gso off gro off lro off
Run your iperf3 test again. If the 0 bps drops disappear, you've found the culprit. You can then re-enable the offloads one by one to see which specific offload engine is failing, as sketched below.
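A simple sketch for that re-enable pass (rerun iperf3 at each prompt and watch for the 0 bps stalls to return):
for feat in tso gso gro lro; do
  sudo ethtool -K mgbe0_0 "$feat" on
  read -r -p "re-enabled $feat; run iperf3, then press Enter to continue "
done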
At high speeds, the default DMA ring buffer sizes might be too small. If the CPU is momentarily busy and doesn’t service the network interrupt fast enough, the ring buffer fills up, packets are tail-dropped, and TCP stalls out.
First, check your current and maximum ring buffer sizes:
ethtool -g mgbe0_0
On my Thor it shows a preset maximum of RX: 16384, so you could try raising it to 16384 to see whether it helps with the 0 bps stalls:
for i in {0..3}; do sudo ethtool -G mgbe${i}_0 rx 16384; done
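To confirm the new size actually took effect on each port:
for i in {0..3}; do sudo ethtool -g mgbe${i}_0 | grep -A1 'Current'; done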
Interrupt coalescing delays hardware interrupts so the CPU can process packets in batches. If the delay is too long, latency spikes; if it’s too short, the CPU is overwhelmed with IRQs. Check the current settings:
ethtool -c mgbe0_0
Try modifying the rx-usecs (the time to wait before generating an interrupt) to a slightly higher or lower value to see if it stabilizes the TCP stream. A good starting test is disabling adaptive RX and setting a fixed microsecond delay:
sudo ethtool -C mgbe0_0 adaptive-rx off rx-usecs 50
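While experimenting, it can help to watch the interrupt rate during an iperf3 run (assuming the MGBE IRQs appear under that name in /proc/interrupts; adjust the grep pattern if yours are named differently):
watch -n 1 "grep -i mgbe /proc/interrupts"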
Try setting Thor to maximum performance profile:
sudo nvpmodel -m 0
sudo jetson_clocks
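You can query the active power model first to see where you are starting from:
sudo nvpmodel -q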
I don't know if this will help, but you could try enabling pause frames in the device tree, as shown in this post:
https://forums.developer.nvidia.com/t/jetson-thor-mgbe-tx-pause-frames/360491/5
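Before touching the device tree, you can at least check the current pause-frame state from userspace (whether ethtool -A changes stick may depend on the DT):
ethtool -a mgbe0_0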
Thanks, I'll look into it.
Here is my ethtool -g and -c output. The pre-set RX maximum is also 16384, but the RX under "Current hardware settings" is only 4096; I'm not sure why that is. Adaptive RX: n/a TX: n/a, so neither appears to be enabled. I haven't tuned any of these network parameters; everything was built directly from source.
$ sudo ethtool -g mgbe0_0
Ring parameters for mgbe0_0:
Pre-set maximums:
RX: 16384
RX Mini: n/a
RX Jumbo: n/a
TX: 4096
TX push buff len: n/a
Current hardware settings:
RX: 4096
RX Mini: n/a
RX Jumbo: n/a
TX: 4096
RX Buf Len: n/a
CQE Size: n/a
TX Push: off
RX Push: off
TX push buff len: n/a
TCP data split: n/a
$ sudo ethtool -c mgbe0_0
Coalesce parameters for mgbe0_0:
Adaptive RX: n/a TX: n/a
stats-block-usecs: n/a
sample-interval: n/a
pkt-rate-low: n/a
pkt-rate-high: n/a
rx-usecs: 512
rx-frames: 64
rx-usecs-irq: n/a
rx-frames-irq: n/a
tx-usecs: 256
tx-frames: 16
tx-usecs-irq: n/a
tx-frames-irq: n/a
rx-usecs-low: n/a
rx-frame-low: n/a
tx-usecs-low: n/a
tx-frame-low: n/a
rx-usecs-high: n/a
rx-frame-high: n/a
tx-usecs-high: n/a
tx-frame-high: n/a
CQE mode RX: n/a TX: n/a
tx-aggr-max-bytes: n/a
tx-aggr-max-frames: n/a
tx-aggr-time-usecs: n/a