Slow packet forwarding between eqos and i350

I’m running a Jetson TX2 as a router for other Jetsons, and I have serious problems with the performance of network packet forwarding.

On any direct connection in my setups I can achieve 960 Mbit/s without any problems, so each single
network connection is OK. The problem begins when I start using the Jetson TX2 as a network switch/router.

I have two setups:
laptop - (i350 = i350 on jetson tx2) - eqos on xavier
laptop - (i350 = eqos on jetson tx2) - eqos on xavier

The first setup is faster; the second doesn’t work at all.

I can achieve 900 Mbit/s on the first setup when I set the MTU on both links to 9000.
This is not usable, because the laptop simulates a wide outer network where nobody will give me an MTU of 9000;
it will be 1500.

When the MTU is lower than 9000, neither setup can give me even 900 Mbit/s of throughput, and
the symptoms differ.
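
A rough way to see why the MTU matters so much here: at a fixed line rate, the number of frames per second the forwarding box has to handle is inversely proportional to the frame size. The sketch below assumes a saturated 1 Gbit/s link and the standard 38 bytes of per-frame Ethernet overhead on the wire (14 header + 4 FCS + 8 preamble + 12 inter-frame gap). The function name is made up for illustration.

```shell
#!/bin/sh
# Approximate frames per second on a saturated link.
# $1 = MTU in bytes, $2 = link rate in bit/s (defaults to 1 Gbit/s).
# Assumes 38 bytes of per-frame Ethernet overhead on the wire.
frames_per_second() {
    mtu=$1
    rate=${2:-1000000000}
    echo $(( rate / ((mtu + 38) * 8) ))
}

for mtu in 1500 9000; do
    echo "MTU $mtu: $(frames_per_second "$mtu") frames/s"
done
```

This prints about 81274 frames/s for MTU 1500 and 13830 frames/s for MTU 9000, so the router handles roughly six times as many frames at the smaller MTU. If the per-packet cost dominates, that would be consistent with CPU0 saturating only below MTU 9000.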

Setup 1 (two i350 NIC on TX2).

When the MTU is lower than 9000, CPU0 is at 99% usage and the speed drops to 600 Mbit/s.
I benchmark with iperf3 using very simple settings:

laptop:

iperf3 -s

xavier:

iperf3 -c laptop

With this setup, the Jetson TX2 in the middle shows CPU0 at 100% load (in tegrastats and htop) if the MTU is less than 9000.
It is impossible to change the interrupt affinity: every attempt via /proc and /sys fails with a read/write error or something similar.

It seems to be impossible to change which core handles the interrupts =( Maybe I’m wrong and there is some way to bind
the NIC to a core other than the first one?
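
For reference, the usual Linux mechanism for this is the per-IRQ smp_affinity file. The sketch below is only an outline, assuming the NIC's interrupt lines are labelled with the interface name in /proc/interrupts. The helper names are hypothetical, and on this platform the write may well fail with the I/O error described above, which would mean the interrupt controller or driver does not allow that IRQ to be re-targeted.

```shell
#!/bin/sh
# Hex CPU mask with only core $1 set (core 0 -> 1, core 3 -> 8).
cpu_mask() {
    printf '%x\n' $(( 1 << $1 ))
}

# Print IRQ numbers whose /proc/interrupts line mentions device $1.
# $2 optionally names another file in the same format (for testing).
# Plain substring match, so "eth1" also matches "eth10".
irqs_for() {
    awk -v dev="$1" 'index($0, dev) { sub(":", "", $1); print $1 }' \
        "${2:-/proc/interrupts}"
}

# Try to pin every IRQ of device $1 to core $2 (needs root).
pin_irqs() {
    for irq in $(irqs_for "$1"); do
        cpu_mask "$2" > "/proc/irq/$irq/smp_affinity" ||
            echo "IRQ $irq: affinity write rejected" >&2
    done
}

# Example (not run here): pin_irqs eth1 3
```

After a successful write, the per-CPU counters for that IRQ in /proc/interrupts should start increasing on the chosen core instead of CPU0.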

When the MTU is 9000, CPU0 usage floats around 60-80% and forwarded traffic is about 900-950 Mbit/s.

Setup 2 (i350 and builtin eqos on TX2).

Throughput fluctuates between 40 and 270 Mbit/s and is very unstable:

[ 4] 3.00-4.00 sec 44.6 MBytes 374 Mbits/sec 1 402 KBytes
[ 4] 4.00-5.00 sec 29.8 MBytes 250 Mbits/sec 0 1.41 KBytes
[ 4] 5.00-6.00 sec 1.07 MBytes 8.97 Mbits/sec 1 489 KBytes
[ 4] 6.00-7.00 sec 20.7 MBytes 173 Mbits/sec 1 197 KBytes
[ 4] 7.00-8.00 sec 54.9 MBytes 460 Mbits/sec 0 349 KBytes
[ 4] 8.00-9.00 sec 33.1 MBytes 278 Mbits/sec 0 1.41 KBytes

CPU0 usage on the TX2 is 10-60% and very unstable, so this setup is not usable at all.
It looks like eqos is not stable.
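
To put a number on the instability, the per-interval bandwidth column can be summarized. The sketch below is a small awk filter, assuming iperf3's default text output, where each interval line contains a value followed by `Mbits/sec`. The function name is arbitrary.

```shell
#!/bin/sh
# Summarize the Mbits/sec column of iperf3 interval lines read from stdin.
# Note: iperf3's final sender/receiver summary lines also match, so strip
# them first if only the per-second intervals are wanted.
iperf_summary() {
    awk '
        {
            for (i = 2; i <= NF; i++) {
                if ($i == "Mbits/sec") {
                    v = $(i - 1) + 0
                    if (n == 0 || v < min) min = v
                    if (n == 0 || v > max) max = v
                    sum += v
                    n++
                }
            }
        }
        END {
            if (n > 0)
                printf "min=%s max=%s mean=%.1f\n", min, max, sum / n
        }
    '
}

# Example: iperf3 -c 10.24.0.3 | iperf_summary
```

For the six intervals quoted above this gives a minimum of 8.97, a maximum of 460 and a mean of about 257.3 Mbit/s.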

Do I have any chance of achieving 1 Gbit/s? I was very surprised that in the era of 400 Gbit/s NICs we have problems with 1 Gbit/s =((

Hi maxlapshin,

This usecase is not very common and we don’t verify it.

How did you set up eqos as a router/switch?

Well, it was very simple: echo 1 > /proc/sys/net/ipv4/ip_forward

and I launched iperf3 from the computer connected to eqos to the computer connected to the i350 port.

When I use eqos, the speed jumps between 20 and 300 Mbit/s and CPU0 load jumps between 10 and 60%. Very unstable.

When I use two ports on the i350, core 0 is almost fully busy handling interrupts at 400 Mbit/s. At 500 Mbit/s it is at 100% load, but throughput is pretty stable.
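
One way to check that core 0 really is doing all the packet work is to look at the per-CPU NET_RX softirq counters in /proc/softirqs before and after an iperf3 run. A minimal sketch, with a hypothetical helper name:

```shell
#!/bin/sh
# Print "CPUn count" for the NET_RX softirq, one line per CPU.
# $1 optionally names a file in /proc/softirqs format (for testing).
net_rx_per_cpu() {
    awk '
        NR == 1 { for (i = 1; i <= NF; i++) cpu[i] = $i }
        $1 == "NET_RX:" { for (i = 2; i <= NF; i++) print cpu[i - 1], $i }
    ' "${1:-/proc/softirqs}"
}

# Example: net_rx_per_cpu; sleep 10; net_rx_per_cpu
```

If only the CPU0 counter grows between the two snapshots while traffic is being forwarded, receive processing is confined to that core.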

If it was possible to move interrupt handling from core 0

Hi,

We will try to reproduce this issue on a devkit. However, we don’t have an i350, only a NIC from another vendor. Is that okay?

Could you share the step-by-step method/commands? Also, is an NX needed on the other side, or could we use another host too?

Any other NIC will be OK. I’m 100% sure that any NIC that is 1 Gbit compliant, even a very cheap one, will do.

Direct laptop-eqos connection shows 970 mbit/s when I run iperf:

iperf3 -s on laptop

iperf3 -c laptop-ip on jetson works without any issues.

Problems start when we enable packet forwarding.

The third computer can be anything: a Jetson, another laptop, a powerful server. It doesn’t matter, because the issues are only with the intermediate Jetson.

It seems to be related to the impossibility of moving NIC interrupts to another core.
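
If the hardware interrupt really cannot be re-targeted, Receive Packet Steering (RPS) is a generic Linux feature that moves protocol processing, though not the interrupt itself, onto other cores. Nobody in this thread tried it, so the following is only a sketch under assumptions: the kernel is built with RPS support, the interface exposes an rx-0 queue, and all six TX2 cores (numbered 0 to 5) are online in the current nvpmodel mode. The helper name is hypothetical.

```shell
#!/bin/sh
# Hex mask covering $2 CPUs (numbered from 0) with CPU $1 left out.
rps_mask_excluding() {
    printf '%x\n' $(( ((1 << $2) - 1) & ~(1 << $1) ))
}

# Example (needs root, not run here):
#   rps_mask_excluding 0 6 > /sys/class/net/eth0/queues/rx-0/rps_cpus
```

With six cores and CPU0 excluded, the mask is 3e. Whether this actually helps forwarding throughput on the TX2 would have to be measured.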

Hi,

Could you share the commands for the problematic case? I understand direct connection has no problem so we only need to check the problematic case.

I mean a list of steps from scratch including the packet forwarding.

Ok, laptop (interface enp4s0):

ip addr add 10.24.0.3/24 dev enp4s0
ip route add 10.25.0.0/24 via 10.24.0.4
iperf3 -s

Jetson TX2 (eth1 faces the laptop, eth0 faces the Xavier NX):

ip addr add 10.24.0.4/24 dev eth1
ip addr add 10.25.0.4/24 dev eth0
echo 1 > /proc/sys/net/ipv4/ip_forward
nvpmodel -m 2

xavier nx (eth0)

ip addr add 10.25.0.3/24 dev eth0
ip route add 10.24.0.0/24 via 10.25.0.4
iperf3 -c 10.24.0.3

Hi maxlapshin,

We tried to reproduce this issue with our NIC, but we did not see any instability.

Connecting to host 10.25.0.3, port 5201
[  4] local 10.24.0.3 port 55310 connected to 10.25.0.3 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec  70.3 MBytes   589 Mbits/sec    0    823 KBytes       
[  4]   1.00-2.00   sec  71.2 MBytes   598 Mbits/sec    0    901 KBytes       
[  4]   2.00-3.00   sec  71.2 MBytes   598 Mbits/sec    0    963 KBytes       
[  4]   3.00-4.00   sec  70.0 MBytes   587 Mbits/sec    0   1012 KBytes       
[  4]   4.00-5.00   sec  71.2 MBytes   598 Mbits/sec    0   1.04 MBytes       
[  4]   5.00-6.00   sec  71.2 MBytes   598 Mbits/sec    0   1.04 MBytes       
[  4]   6.00-7.00   sec  70.0 MBytes   587 Mbits/sec    0   1.04 MBytes       
[  4]   7.00-8.00   sec  71.2 MBytes   598 Mbits/sec    0   1.04 MBytes       
[  4]   8.00-9.00   sec  70.0 MBytes   587 Mbits/sec    0   1.04 MBytes       
[  4]   9.00-10.00  sec  71.2 MBytes   598 Mbits/sec    0   1.11 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -