The Ethernet interface over PCIe (endpoint mode) does not work

Hi @hermes_wu

Yes, I noticed this issue. Will check what is going on.

BTW, could you help me check the functionality on JP 5.0.2? I remember this working on my side in a previous setup (NM disabled, netplan setup).

Hi @WayneWWW

I have checked the functionality on JetPack 5.0.2, and the result is the same as on JetPack 5.1.

Hi @WayneWWW

I used Wireshark to capture Ethernet packets on the EP side and found that the Ethernet driver does receive the packets sent from the RC side. However, there seems to be no response at the network layer (the ARP requests get no reply). Do you have any idea what could cause this? The related information is attached below.
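One way to cross-check the "no ARP reply" observation without Wireshark is to look at the kernel's neighbour table in /proc/net/arp. The snippet below is only a minimal sketch: the helper name `arp_entry_state` and the sample table are hypothetical and are not taken from the boards in this thread. It relies on the standard /proc/net/arp column layout, where flag bit 0x02 marks a resolved entry.

```python
ATF_COM = 0x02  # kernel flag: the entry holds a resolved MAC address


def arp_entry_state(arp_table: str, ip: str) -> str:
    """Return 'complete', 'incomplete', or 'missing' for ip in /proc/net/arp text."""
    for line in arp_table.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 6 and fields[0] == ip:
            flags = int(fields[2], 16)
            return "complete" if flags & ATF_COM else "incomplete"
    return "missing"


# Hypothetical sample showing what an unresolved peer looks like.
SAMPLE = """IP address       HW type     Flags       HW address            Mask     Device
10.10.10.1       0x1         0x0         00:00:00:00:00:00     *        eth2
"""

print(arp_entry_state(SAMPLE, "10.10.10.1"))
```

On a real board, the text read from /proc/net/arp would be passed in place of SAMPLE. An "incomplete" entry for the peer address after a ping attempt would be consistent with ARP requests going out but no reply being accepted.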

EP : Jetpack 5.1
RC : Jetpack 5.1

EP side test:

root@tegra-ubuntu:/sys/kernel/config/pci_ep# ifconfig eth2
eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 64512
        inet 10.10.10.2  netmask 255.255.255.0  broadcast 10.10.10.255
        inet6 fe80::860:6ff:fe52:ec10  prefixlen 64  scopeid 0x20<link>
        ether 0a:60:06:52:ec:10  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

root@tegra-ubuntu:/sys/kernel/config/pci_ep# ping 10.10.10.1
PING 10.10.10.1 (10.10.10.1) 56(84) bytes of data.
From 10.10.10.2 icmp_seq=1 Destination Host Unreachable
From 10.10.10.2 icmp_seq=2 Destination Host Unreachable
From 10.10.10.2 icmp_seq=5 Destination Host Unreachable
From 10.10.10.2 icmp_seq=6 Destination Host Unreachable
^C
--- 10.10.10.1 ping statistics ---
7 packets transmitted, 0 received, +4 errors, 100% packet loss, time 6145ms
pipe 4
root@tegra-ubuntu:/sys/kernel/config/pci_ep# 

RC side test:

root@tegra-ubuntu:/home/denso# ifconfig eth0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 64512
        inet 10.10.10.1  netmask 255.255.255.0  broadcast 10.10.10.255
        inet6 fe80::e060:68ff:fea3:3a83  prefixlen 64  scopeid 0x20<link>
        ether e2:60:68:a3:3a:83  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

root@tegra-ubuntu:/home/denso# ping 10.10.10.2
PING 10.10.10.2 (10.10.10.2) 56(84) bytes of data.
From 10.10.10.1 icmp_seq=1 Destination Host Unreachable
From 10.10.10.1 icmp_seq=2 Destination Host Unreachable
From 10.10.10.1 icmp_seq=3 Destination Host Unreachable
From 10.10.10.1 icmp_seq=4 Destination Host Unreachable
From 10.10.10.1 icmp_seq=5 Destination Host Unreachable
From 10.10.10.1 icmp_seq=6 Destination Host Unreachable
^C
--- 10.10.10.2 ping statistics ---
7 packets transmitted, 0 received, +6 errors, 100% packet loss, time 6144ms
pipe 4
root@tegra-ubuntu:/home/denso#

Wireshark snapshot:

Hi @WayneWWW

I borrowed an AGX Orin dev kit and replaced the AGX Xavier on the RC side with it, then ran a comparison test. Please refer to test set-2 in the test cases below.

Summary:

  1. Ethernet communication between EP and RC works in test set-2.
  2. The bitrate from RC to EP is less than 4 Gbits/s.
  3. The bitrate from EP to RC is less than 7 Gbits/s.

Question:

  1. Why does the result differ between AGX Xavier and AGX Orin when both run the same JP 5.1?
  2. The throughput does not seem to match PCIe Gen3 x8 (it is about 10% or less), even though I set the power mode to MAXN.
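To put a number on the second question, the measured iperf3 averages from the logs in this post (3.44 Gbits/sec RC to EP, 6.45 Gbits/sec EP to RC) can be compared with the 63.008 Gb/s that the RC kernel log reports for the 8.0 GT/s x8 link. This is a rough sketch only: the function name `link_utilization` is made up for illustration, and the link figure is raw bandwidth that does not account for PCIe packet overhead or the TCP/IP stack.

```python
# Available bandwidth as printed by the RC kernel for the 8.0 GT/s x8 link.
LINK_GBPS = 63.008


def link_utilization(measured_gbps: float, link_gbps: float = LINK_GBPS) -> float:
    """Return the measured bitrate as a percentage of the raw link bandwidth."""
    return 100.0 * measured_gbps / link_gbps


# iperf3 averages taken from the test logs in this post.
for label, rate in (("RC -> EP", 3.44), ("EP -> RC", 6.45)):
    print(f"{label}: {rate} Gbit/s is {link_utilization(rate):.1f}% of {LINK_GBPS} Gb/s")
```

Both directions come out at roughly 5% and 10% of the raw link figure, which matches the order of magnitude quoted in the question.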

Test Cases:
PCIe_InterCon_Compare

EP Side test log:

root@tegra-ubuntu:/sys/kernel/config/pci_ep# ifconfig eth1
eth1: flags=4098<BROADCAST,MULTICAST>  mtu 64512
        ether fa:a6:fa:3a:19:bf  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

root@tegra-ubuntu:/sys/kernel/config/pci_ep# ifconfig eth1 up
root@tegra-ubuntu:/sys/kernel/config/pci_ep# [  333.694349] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[  333.694576] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[  333.831660] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[  333.831938] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[  333.866746] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[  333.867004] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[  334.370709] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[  334.370985] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[  334.594560] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[  334.594821] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!

root@tegra-ubuntu:/sys/kernel/config/pci_ep# ifconfig eth1
eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 64512
        inet 10.10.10.2  netmask 255.255.255.0  broadcast 10.10.10.255
        inet6 fe80::f8a6:faff:fe3a:19bf  prefixlen 64  scopeid 0x20<link>
        ether fa:a6:fa:3a:19:bf  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

root@tegra-ubuntu:/sys/kernel/config/pci_ep# ping 10.10.10.1
PING 10.10.10.1 (10.10.10.1) 56(84) bytes of data.
64 bytes from 10.10.10.1: icmp_seq=1 ttl=64 time=0.597 ms
64 bytes from 10.10.10.1: icmp_seq=2 ttl=64 time=1.27 ms
64 bytes from 10.10.10.1: icmp_seq=3 ttl=64 time=1.27 ms
64 bytes from 10.10.10.1: icmp_seq=4 ttl=64 time=1.21 ms
^C
--- 10.10.10.1 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3005ms
rtt min/avg/max/mdev = 0.597/1.085/1.272/0.283 ms
root@tegra-ubuntu:/sys/kernel/config/pci_ep# iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 10.10.10.1, port 47622
[  5] local 10.10.10.2 port 5201 connected to 10.10.10.1 port 47624
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   418 MBytes  3.51 Gbits/sec                  
[  5]   1.00-2.00   sec   414 MBytes  3.47 Gbits/sec                  
[  5]   2.00-3.00   sec   412 MBytes  3.45 Gbits/sec                  
[  5]   3.00-4.00   sec   409 MBytes  3.43 Gbits/sec                  
[  5]   4.00-5.00   sec   408 MBytes  3.43 Gbits/sec                  
[  5]   5.00-6.00   sec   408 MBytes  3.42 Gbits/sec                  
[  5]   6.00-7.00   sec   411 MBytes  3.45 Gbits/sec                  
[  5]   7.00-8.00   sec   407 MBytes  3.41 Gbits/sec                  
[  5]   8.00-9.00   sec   406 MBytes  3.41 Gbits/sec                  
[  5]   9.00-10.00  sec   406 MBytes  3.41 Gbits/sec                  
[  5]  10.00-10.00  sec   831 KBytes  3.30 Gbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec  4.00 GBytes  3.44 Gbits/sec                  receiver
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
^Ciperf3: interrupt - the server has terminated
root@tegra-ubuntu:/sys/kernel/config/pci_ep# iperf3 -c 10.10.10.1
Connecting to host 10.10.10.1, port 5201
[  5] local 10.10.10.2 port 60002 connected to 10.10.10.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   782 MBytes  6.56 Gbits/sec    0   3.07 MBytes       
[  5]   1.00-2.00   sec   742 MBytes  6.23 Gbits/sec    0   3.07 MBytes       
[  5]   2.00-3.00   sec   739 MBytes  6.19 Gbits/sec    0   3.07 MBytes       
[  5]   3.00-4.00   sec   750 MBytes  6.30 Gbits/sec    0   3.07 MBytes       
[  5]   4.00-5.00   sec   756 MBytes  6.34 Gbits/sec    0   3.07 MBytes       
[  5]   5.00-6.00   sec   780 MBytes  6.54 Gbits/sec    0   3.07 MBytes       
[  5]   6.00-7.00   sec   791 MBytes  6.64 Gbits/sec    0   3.07 MBytes       
[  5]   7.00-8.00   sec   798 MBytes  6.69 Gbits/sec    0   3.07 MBytes       
[  5]   8.00-9.00   sec   784 MBytes  6.58 Gbits/sec    0   3.07 MBytes       
[  5]   9.00-10.00  sec   772 MBytes  6.48 Gbits/sec    0   3.07 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  7.51 GBytes  6.45 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  7.51 GBytes  6.45 Gbits/sec                  receiver

iperf Done.
root@tegra-ubuntu:/sys/kernel/config/pci_ep# 

RC Side test log:

root@tegra-ubuntu:/home/denso# ifconfig eth0
eth0: flags=4098<BROADCAST,MULTICAST>  mtu 64512
        ether f6:a1:f9:f6:21:a6  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

root@tegra-ubuntu:/home/denso# ifconfig eth0 up
root@tegra-ubuntu:/home/denso# ifconfig eth0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 64512
        inet 10.10.10.1  netmask 255.255.255.0  broadcast 10.10.10.255
        inet6 fe80::f4a1:f9ff:fef6:21a6  prefixlen 64  scopeid 0x20<link>
        ether f6:a1:f9:f6:21:a6  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

root@tegra-ubuntu:/home/denso# ping 10.10.10.2
PING 10.10.10.2 (10.10.10.2) 56(84) bytes of data.
64 bytes from 10.10.10.2: icmp_seq=1 ttl=64 time=3.18 ms
64 bytes from 10.10.10.2: icmp_seq=2 ttl=64 time=1.66 ms
64 bytes from 10.10.10.2: icmp_seq=3 ttl=64 time=1.65 ms
64 bytes from 10.10.10.2: icmp_seq=4 ttl=64 time=1.60 ms
^C
--- 10.10.10.2 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3006ms
rtt min/avg/max/mdev = 1.599/2.023/3.180/0.668 ms
root@tegra-ubuntu:/home/denso# iperf3 -c 10.10.10.2
Connecting to host 10.10.10.2, port 5201
[  5] local 10.10.10.1 port 47624 connected to 10.10.10.2 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   418 MBytes  3.51 Gbits/sec    0   1.35 MBytes       
[  5]   1.00-2.00   sec   414 MBytes  3.47 Gbits/sec    0   1.35 MBytes       
[  5]   2.00-3.00   sec   412 MBytes  3.45 Gbits/sec    0   1.35 MBytes       
[  5]   3.00-4.00   sec   408 MBytes  3.43 Gbits/sec    0   1.35 MBytes       
[  5]   4.00-5.00   sec   409 MBytes  3.43 Gbits/sec    0   1.35 MBytes       
[  5]   5.00-6.00   sec   409 MBytes  3.42 Gbits/sec    0   2.09 MBytes       
[  5]   6.00-7.00   sec   410 MBytes  3.45 Gbits/sec    0   2.09 MBytes       
[  5]   7.00-8.00   sec   408 MBytes  3.41 Gbits/sec    0   2.09 MBytes       
[  5]   8.00-9.00   sec   406 MBytes  3.41 Gbits/sec    0   2.09 MBytes       
[  5]   9.00-10.00  sec   406 MBytes  3.41 Gbits/sec    0   2.09 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  4.00 GBytes  3.44 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  4.00 GBytes  3.44 Gbits/sec                  receiver

iperf Done.
root@tegra-ubuntu:/home/denso# iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 10.10.10.2, port 60000
[  5] local 10.10.10.1 port 5201 connected to 10.10.10.2 port 60002
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   774 MBytes  6.49 Gbits/sec                  
[  5]   1.00-2.00   sec   749 MBytes  6.29 Gbits/sec                  
[  5]   2.00-3.00   sec   738 MBytes  6.19 Gbits/sec                  
[  5]   3.00-4.00   sec   750 MBytes  6.29 Gbits/sec                  
[  5]   4.00-5.00   sec   756 MBytes  6.34 Gbits/sec                  
[  5]   5.00-6.00   sec   780 MBytes  6.54 Gbits/sec                  
[  5]   6.00-7.00   sec   791 MBytes  6.64 Gbits/sec                  
[  5]   7.00-8.00   sec   798 MBytes  6.69 Gbits/sec                  
[  5]   8.00-9.00   sec   785 MBytes  6.58 Gbits/sec                  
[  5]   9.00-10.00  sec   772 MBytes  6.48 Gbits/sec                  
[  5]  10.00-10.00  sec  1.50 MBytes  6.08 Gbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec  7.51 GBytes  6.45 Gbits/sec                  receiver
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
^Ciperf3: interrupt - the server has terminated

root@tegra-ubuntu:/home/denso# dmesg | grep -i pcie
[    7.591710] tegra194-pcie 14100000.pcie: Adding to iommu group 9
[    7.604338] tegra194-pcie 14100000.pcie: Using GICv2m MSI allocator
[    7.611451] tegra194-pcie 14160000.pcie: Adding to iommu group 10
[    7.623846] tegra194-pcie 14160000.pcie: Using GICv2m MSI allocator
[    7.630651] tegra194-pcie 141a0000.pcie: Adding to iommu group 11
[    7.642535] tegra194-pcie 141a0000.pcie: Using GICv2m MSI allocator
[    7.649500] tegra194-pcie 141a0000.pcie: Failed to get slot regulators: -517
[    9.929852] tegra194-pcie 14100000.pcie: Using GICv2m MSI allocator
[    9.944360] tegra194-pcie 14100000.pcie: host bridge /pcie@14100000 ranges:
[    9.951534] tegra194-pcie 14100000.pcie:       IO 0x0030100000..0x00301fffff -> 0x0030100000
[    9.963943] tegra194-pcie 14100000.pcie:      MEM 0x20a8000000..0x20afffffff -> 0x0040000000
[    9.980207] tegra194-pcie 14100000.pcie:      MEM 0x2080000000..0x20a7ffffff -> 0x2080000000
[   10.099975] tegra194-pcie 14100000.pcie: Link up
[   10.106675] tegra194-pcie 14100000.pcie: PCI host bridge to bus 0001:00
[   10.318073] pcieport 0001:00:00.0: Adding to iommu group 9
[   10.330202] pcieport 0001:00:00.0: PME: Signaling with IRQ 51
[   10.336442] pcieport 0001:00:00.0: AER: enabled with IRQ 51
[   10.343009] tegra194-pcie 14160000.pcie: Using GICv2m MSI allocator
[   10.350672] tegra194-pcie 14160000.pcie: host bridge /pcie@14160000 ranges:
[   10.364574] tegra194-pcie 14160000.pcie:       IO 0x0036100000..0x00361fffff -> 0x0036100000
[   10.373269] tegra194-pcie 14160000.pcie:      MEM 0x2428000000..0x242fffffff -> 0x0040000000
[   10.381970] tegra194-pcie 14160000.pcie:      MEM 0x2140000000..0x2427ffffff -> 0x2140000000
[   11.496264] tegra194-pcie 14160000.pcie: Phy link never came up
[   11.502439] tegra194-pcie 14160000.pcie: PCI host bridge to bus 0004:00
[   11.578175] pcieport 0004:00:00.0: Adding to iommu group 10
[   11.584176] pcieport 0004:00:00.0: PME: Signaling with IRQ 53
[   11.590456] pcieport 0004:00:00.0: AER: enabled with IRQ 53
[   11.615584] tegra194-pcie 141a0000.pcie: Using GICv2m MSI allocator
[   11.622152] tegra194-pcie 141a0000.pcie: Failed to get slot regulators: -517
[   11.622247] vdd-3v3-pcie: supplied by vdd-3v3-sys
[   11.681432] tegra194-pcie 141a0000.pcie: Using GICv2m MSI allocator
[   11.797188] tegra194-pcie 141a0000.pcie: host bridge /pcie@141a0000 ranges:
[   11.804375] tegra194-pcie 141a0000.pcie:       IO 0x003a100000..0x003a1fffff -> 0x003a100000
[   11.813055] tegra194-pcie 141a0000.pcie:      MEM 0x2b28000000..0x2b2fffffff -> 0x0040000000
[   11.821733] tegra194-pcie 141a0000.pcie:      MEM 0x2740000000..0x2b27ffffff -> 0x2740000000
[   11.936120] tegra194-pcie 141a0000.pcie: Link up
[   11.943017] tegra194-pcie 141a0000.pcie: PCI host bridge to bus 0005:00
[   12.036410] pci 0005:01:00.0: 63.008 Gb/s available PCIe bandwidth, limited by 8.0 GT/s PCIe x8 link at 0005:00:00.0 (capable of 126.024 Gb/s with 16.0 GT/s PCIe x8 link)
[   12.134372] pcieport 0005:00:00.0: Adding to iommu group 11
[   12.140347] pcieport 0005:00:00.0: PME: Signaling with IRQ 55
[   12.146620] pcieport 0005:00:00.0: AER: enabled with IRQ 55
root@tegra-ubuntu:/home/denso# 

For the performance issue, what is the lspci -vvv result on the RC side?

And for the PCIe Ethernet issue, could you share the differences between your Orin and Xavier setups?

I mean, did you apply the same patches to both or not?
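For the performance part, the fields of lspci -vvv that matter most are LnkCap (what the port is capable of) and LnkSta (what was actually negotiated). The sketch below pulls both out of saved lspci text and flags links that trained below their capability. The helper name `link_summary` is hypothetical, and the sample lines are copied from the RC-side root port 0005:00:00.0 in this thread.

```python
import re

# Matches "LnkCap:" / "LnkSta:" lines but not LnkCtl2 / LnkSta2.
LNK_RE = re.compile(r"Lnk(Cap|Sta):\s.*?Speed ([\d.]+)GT/s.*?Width x(\d+)")


def link_summary(lspci_text: str) -> dict:
    """Map each device address to its capable and negotiated (speed, width)."""
    devices, current = {}, None
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():  # unindented line starts a new device
            current = line.split()[0]
            devices[current] = {}
            continue
        match = LNK_RE.search(line)
        if match and current:
            devices[current][match.group(1)] = (float(match.group(2)), int(match.group(3)))
    return devices


SAMPLE = (
    "0005:00:00.0 PCI bridge: NVIDIA Corporation Device 229a (rev a1)\n"
    "\t\tLnkCap:\tPort #0, Speed 16GT/s, Width x8, ASPM not supported\n"
    "\t\tLnkSta:\tSpeed 8GT/s (downgraded), Width x8 (ok)\n"
)

for addr, link in link_summary(SAMPLE).items():
    cap, sta = link.get("Cap"), link.get("Sta")
    if cap and sta and (sta[0] < cap[0] or sta[1] < cap[1]):
        print(f"{addr}: running at {sta[0]} GT/s x{sta[1]}, capable of {cap[0]} GT/s x{cap[1]}")
```

Feeding the full lspci -vvv output into `link_summary` gives one entry per device, which makes it easy to spot whether the EP link is the one running below its capability.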

Hi @WayneWWW

I applied the same patches on both Orin and Xavier. That is, I used the same image built on my side and just specified a different .conf when flashing them.

Xavier flash command:

./flash jetson-agx-xavier-devkit mmcblk0p1

Orin flash command:

./flash jetson-agx-orin-devkit mmcblk0p1

The lspci -vvv result on RC side:

root@tegra-ubuntu:/home/denso# lspci -vvv
0001:00:00.0 PCI bridge: NVIDIA Corporation Device 229e (rev a1) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 51
	Bus: primary=00, secondary=01, subordinate=ff, sec-latency=0
	I/O behind bridge: 00001000-00001fff [size=4K]
	Memory behind bridge: 40000000-400fffff [size=1M]
	Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff [disabled]
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
	BridgeCtl: Parity- SERR+ NoISA- VGA- VGA16- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [40] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold-)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
		Address: 0000000000000000  Data: 0000
	Capabilities: [70] Express (v2) Root Port (Slot-), MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0
			ExtTag- RBE+
		DevCtl:	CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 128 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
		LnkCap:	Port #0, Speed 16GT/s, Width x1, ASPM not supported
			ClockPM- Surprise+ LLActRep+ BwNot+ ASPMOptComp+
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk-
			ExtSynch- ClockPM- AutWidDis- BWInt+ AutBWInt-
		LnkSta:	Speed 2.5GT/s (downgraded), Width x1 (ok)
			TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-
		RootCap: CRSVisible+
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna+ CRSVisible+
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
		DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, NROPrPrP+, LTR+
			 10BitTagComp+, 10BitTagReq-, OBFF Not Supported, ExtFmt-, EETLPPrefix-
			 EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
			 FRS-, LN System CLS Not Supported, TPHComp-, ExtTPHComp-, ARIFwd+
			 AtomicOpsCap: Routing- 32bit- 64bit- 128bitCAS-
		DevCtl2: Completion Timeout: 1ms to 10ms, TimeoutDis-, LTR+, OBFF Disabled ARIFwd-
			 AtomicOpsCtl: ReqEn- EgressBlck-
		LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [b0] MSI-X: Enable- Count=1 Masked-
		Vector table: BAR=0 offset=00000000
		PBA: BAR=0 offset=00000000
	Capabilities: [100 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
		AERCap:	First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
			MultHdrRecCap+ MultHdrRecEn- TLPPfxPres- HdrLogCap-
		HeaderLog: 00000000 00000000 00000000 00000000
		RootCmd: CERptEn+ NFERptEn+ FERptEn+
		RootSta: CERcvd- MultCERcvd- UERcvd- MultUERcvd-
			 FirstFatal- NonFatalMsg- FatalMsg- IntMsg 0
		ErrorSrc: ERR_COR: 0000 ERR_FATAL/NONFATAL: 0000
	Capabilities: [148 v1] Secondary PCI Express
		LnkCtl3: LnkEquIntrruptEn-, PerformEqu-
		LaneErrStat: 0
	Capabilities: [158 v1] Physical Layer 16.0 GT/s <?>
	Capabilities: [17c v1] Lane Margining at the Receiver <?>
	Capabilities: [190 v1] L1 PM Substates
		L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2- ASPM_L1.1- L1_PM_Substates+
			  PortCommonModeRestoreTime=60us PortTPowerOnTime=40us
		L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
			   T_CommonMode=10us
		L1SubCtl2: T_PwrOn=10us
	Capabilities: [1a0 v1] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
	Capabilities: [2a0 v1] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
	Capabilities: [2d8 v1] Data Link Feature <?>
	Capabilities: [2e4 v1] Precision Time Measurement
		PTMCap: Requester:- Responder:+ Root:+
		PTMClockGranularity: 16ns
		PTMControl: Enabled:- RootSelected:-
		PTMEffectiveGranularity: Unknown
	Capabilities: [2f0 v1] Vendor Specific Information: ID=0004 Rev=1 Len=054 <?>
	Capabilities: [358 v1] Vendor Specific Information: ID=0006 Rev=0 Len=018 <?>
	Kernel driver in use: pcieport

0001:01:00.0 Network controller: Realtek Semiconductor Co., Ltd. RTL8822CE 802.11ac PCIe Wireless Network Adapter
	Subsystem: AzureWave RTL8822CE 802.11ac PCIe Wireless Network Adapter
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 32 bytes
	Interrupt: pin A routed to IRQ 313
	Region 0: I/O ports at 1000 [size=256]
	Region 2: Memory at 20a8000000 (64-bit, non-prefetchable) [size=64K]
	Capabilities: [40] Power Management version 3
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
		Address: 000000000f410040  Data: 0262
	Capabilities: [70] Express (v2) Endpoint, MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <4us, L1 <64us
			ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
		DevCtl:	CorrErr- NonFatalErr- FatalErr- UnsupReq-
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
		LnkCap:	Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <4us, L1 <64us
			ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk-
			ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s (ok), Width x1 (ok)
			TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Not Supported, TimeoutDis+, NROPrPrP-, LTR+
			 10BitTagComp-, 10BitTagReq-, OBFF Via message/WAKE#, ExtFmt-, EETLPPrefix-
			 EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
			 FRS-, TPHComp-, ExtTPHComp-
			 AtomicOpsCap: 32bit- 64bit- 128bitCAS-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis+, LTR+, OBFF Disabled
			 AtomicOpsCtl: ReqEn-
		LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [100 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
		AERCap:	First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
			MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
		HeaderLog: 00000000 00000000 00000000 00000000
	Capabilities: [148 v1] Device Serial Number 00-e0-4c-ff-fe-c8-22-01
	Capabilities: [158 v1] Latency Tolerance Reporting
		Max snoop latency: 0ns
		Max no snoop latency: 0ns
	Capabilities: [160 v1] L1 PM Substates
		L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
			  PortCommonModeRestoreTime=30us PortTPowerOnTime=60us
		L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
			   T_CommonMode=0us LTR1.2_Threshold=0ns
		L1SubCtl2: T_PwrOn=10us
	Kernel driver in use: rtl88x2ce
	Kernel modules: rtl8822ce

0005:00:00.0 PCI bridge: NVIDIA Corporation Device 229a (rev a1) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 55
	Bus: primary=00, secondary=01, subordinate=ff, sec-latency=0
	I/O behind bridge: 0000f000-00000fff [disabled]
	Memory behind bridge: 40000000-405fffff [size=6M]
	Prefetchable memory behind bridge: 0000002740000000-00000027400fffff [size=1M]
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
	BridgeCtl: Parity- SERR+ NoISA- VGA- VGA16- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [40] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold-)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
		Address: 0000000000000000  Data: 0000
		Masking: 00000000  Pending: 00000000
	Capabilities: [70] Express (v2) Root Port (Slot-), MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0
			ExtTag- RBE+
		DevCtl:	CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 256 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
		LnkCap:	Port #0, Speed 16GT/s, Width x8, ASPM not supported
			ClockPM- Surprise+ LLActRep+ BwNot+ ASPMOptComp+
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk-
			ExtSynch- ClockPM- AutWidDis- BWInt+ AutBWInt-
		LnkSta:	Speed 8GT/s (downgraded), Width x8 (ok)
			TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-
		RootCap: CRSVisible+
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna+ CRSVisible+
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
		DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, NROPrPrP+, LTR+
			 10BitTagComp+, 10BitTagReq-, OBFF Not Supported, ExtFmt-, EETLPPrefix-
			 EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
			 FRS-, LN System CLS Not Supported, TPHComp-, ExtTPHComp-, ARIFwd+
			 AtomicOpsCap: Routing- 32bit- 64bit- 128bitCAS-
		DevCtl2: Completion Timeout: 1ms to 10ms, TimeoutDis-, LTR+, OBFF Disabled ARIFwd-
			 AtomicOpsCtl: ReqEn- EgressBlck-
		LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+
			 EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
	Capabilities: [b0] MSI-X: Enable- Count=8 Masked-
		Vector table: BAR=2 offset=00000000
		PBA: BAR=2 offset=00010000
	Capabilities: [100 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
		AERCap:	First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
			MultHdrRecCap+ MultHdrRecEn- TLPPfxPres- HdrLogCap-
		HeaderLog: 00000000 00000000 00000000 00000000
		RootCmd: CERptEn+ NFERptEn+ FERptEn+
		RootSta: CERcvd- MultCERcvd- UERcvd- MultUERcvd-
			 FirstFatal- NonFatalMsg- FatalMsg- IntMsg 0
		ErrorSrc: ERR_COR: 0000 ERR_FATAL/NONFATAL: 0000
	Capabilities: [148 v1] Secondary PCI Express
		LnkCtl3: LnkEquIntrruptEn-, PerformEqu-
		LaneErrStat: 0
	Capabilities: [168 v1] Physical Layer 16.0 GT/s <?>
	Capabilities: [190 v1] Lane Margining at the Receiver <?>
	Capabilities: [1c0 v1] L1 PM Substates
		L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2- ASPM_L1.1- L1_PM_Substates+
			  PortCommonModeRestoreTime=60us PortTPowerOnTime=40us
		L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
			   T_CommonMode=60us
		L1SubCtl2: T_PwrOn=10us
	Capabilities: [1d0 v1] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
	Capabilities: [2d0 v1] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
	Capabilities: [308 v1] Data Link Feature <?>
	Capabilities: [314 v1] Precision Time Measurement
		PTMCap: Requester:+ Responder:+ Root:+
		PTMClockGranularity: 16ns
		PTMControl: Enabled:- RootSelected:-
		PTMEffectiveGranularity: Unknown
	Capabilities: [320 v1] Vendor Specific Information: ID=0004 Rev=1 Len=054 <?>
	Capabilities: [388 v1] Vendor Specific Information: ID=0006 Rev=0 Len=018 <?>
	Kernel driver in use: pcieport

0005:01:00.0 Network controller: NVIDIA Corporation Device 2296
	Subsystem: NVIDIA Corporation Device 0000
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 55
	Region 0: Memory at 2b28000000 (32-bit, non-prefetchable) [size=4M]
	Region 2: Memory at 2740000000 (64-bit, prefetchable) [size=128K]
	Region 4: Memory at 2b28400000 (64-bit, non-prefetchable) [size=1M]
	Capabilities: [40] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [50] MSI: Enable- Count=1/16 Maskable+ 64bit+
		Address: 0000000000000000  Data: 0000
		Masking: 00000000  Pending: 00000000
	Capabilities: [70] Express (v2) Endpoint, MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
			ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
		DevCtl:	CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 256 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
		LnkCap:	Port #0, Speed 16GT/s, Width x8, ASPM not supported
			ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk-
			ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 8GT/s (downgraded), Width x8 (ok)
			TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, NROPrPrP-, LTR+
			 10BitTagComp+, 10BitTagReq-, OBFF Not Supported, ExtFmt-, EETLPPrefix-
			 EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
			 FRS-, TPHComp-, ExtTPHComp-
			 AtomicOpsCap: 32bit- 64bit- 128bitCAS-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled
			 AtomicOpsCtl: ReqEn-
		LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+
			 EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
	Capabilities: [b0] MSI-X: Enable+ Count=8 Masked-
		Vector table: BAR=2 offset=00000000
		PBA: BAR=2 offset=00010000
	Capabilities: [100 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
		AERCap:	First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
			MultHdrRecCap+ MultHdrRecEn- TLPPfxPres- HdrLogCap-
		HeaderLog: 00000000 00000000 00000000 00000000
	Capabilities: [148 v1] Secondary PCI Express
		LnkCtl3: LnkEquIntrruptEn-, PerformEqu-
		LaneErrStat: 0
	Capabilities: [168 v1] Physical Layer 16.0 GT/s <?>
	Capabilities: [190 v1] Lane Margining at the Receiver <?>
	Capabilities: [1b8 v1] Latency Tolerance Reporting
		Max snoop latency: 0ns
		Max no snoop latency: 0ns
	Capabilities: [1c0 v1] L1 PM Substates
		L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2- ASPM_L1.1- L1_PM_Substates+
			  PortCommonModeRestoreTime=60us PortTPowerOnTime=40us
		L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
			   T_CommonMode=0us
		L1SubCtl2: T_PwrOn=10us
	Capabilities: [1d0 v1] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
	Capabilities: [2d0 v1] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
	Capabilities: [308 v1] Data Link Feature <?>
	Capabilities: [314 v1] Precision Time Measurement
		PTMCap: Requester:+ Responder:- Root:-
		PTMClockGranularity: Unimplemented
		PTMControl: Enabled:- RootSelected:-
		PTMEffectiveGranularity: Unknown
	Capabilities: [320 v1] Vendor Specific Information: ID=0003 Rev=1 Len=054 <?>
	Kernel driver in use: tvnet

Hi @WayneWWW

I found the root cause of this issue: the PCIe vnet driver does not set max_mtu at probe/bind time. Please refer to the patches below.
However, throughput is still poor, especially from the RC side to the EP side. Is this a limitation of NVIDIA's PCIe vnet driver?

Performance Test EP side:

root@tegra-ubuntu:/sys/kernel/config/pci_ep# iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 10.10.10.1, port 38200
[  5] local 10.10.10.2 port 5201 connected to 10.10.10.1 port 38202
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   478 MBytes  4.01 Gbits/sec                  
[  5]   1.00-2.00   sec   491 MBytes  4.12 Gbits/sec                  
[  5]   2.00-3.00   sec   502 MBytes  4.21 Gbits/sec                  
[  5]   3.00-4.00   sec   524 MBytes  4.39 Gbits/sec                  
[  5]   4.00-5.00   sec   542 MBytes  4.54 Gbits/sec                  
[  5]   5.00-6.00   sec   547 MBytes  4.59 Gbits/sec                  
[  5]   6.00-7.00   sec   550 MBytes  4.61 Gbits/sec                  
[  5]   7.00-8.00   sec   548 MBytes  4.60 Gbits/sec                  
[  5]   8.00-9.00   sec   547 MBytes  4.59 Gbits/sec                  
[  5]   9.00-10.00  sec   547 MBytes  4.59 Gbits/sec                  
[  5]  10.00-10.00  sec  1.00 MBytes  4.20 Gbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec  5.15 GBytes  4.43 Gbits/sec                  receiver
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 10.10.10.1, port 38210
[  5] local 10.10.10.2 port 5201 connected to 10.10.10.1 port 38212
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   272 MBytes  2.28 Gbits/sec                  
[  5]   1.00-2.00   sec   274 MBytes  2.30 Gbits/sec                  
[  5]   2.00-3.00   sec   518 MBytes  4.35 Gbits/sec                  
[  5]   3.00-4.00   sec   543 MBytes  4.56 Gbits/sec                  
[  5]   4.00-5.00   sec   544 MBytes  4.57 Gbits/sec                  
[  5]   5.00-6.00   sec   544 MBytes  4.56 Gbits/sec                  
[  5]   6.00-7.00   sec   543 MBytes  4.56 Gbits/sec                  
[  5]   7.00-8.00   sec   543 MBytes  4.55 Gbits/sec                  
[  5]   8.00-9.00   sec   544 MBytes  4.56 Gbits/sec                  
[  5]   9.00-10.00  sec   545 MBytes  4.57 Gbits/sec                  
[  5]  10.00-10.00  sec   896 KBytes  4.33 Gbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec  4.76 GBytes  4.09 Gbits/sec                  receiver
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
^Ciperf3: interrupt - the server has terminated
root@tegra-ubuntu:/sys/kernel/config/pci_ep# iperf3 -c 10.10.10.1
Connecting to host 10.10.10.1, port 5201
[  5] local 10.10.10.2 port 44478 connected to 10.10.10.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   761 MBytes  6.38 Gbits/sec    0   3.07 MBytes       
[  5]   1.00-2.00   sec   785 MBytes  6.58 Gbits/sec    0   3.07 MBytes       
[  5]   2.00-3.00   sec   792 MBytes  6.64 Gbits/sec    0   3.07 MBytes       
[  5]   3.00-4.00   sec   794 MBytes  6.66 Gbits/sec    0   3.07 MBytes       
[  5]   4.00-5.00   sec   794 MBytes  6.66 Gbits/sec    0   3.07 MBytes       
[  5]   5.00-6.00   sec   798 MBytes  6.69 Gbits/sec    0   3.07 MBytes       
[  5]   6.00-7.00   sec   802 MBytes  6.73 Gbits/sec    0   3.07 MBytes       
[  5]   7.00-8.00   sec   804 MBytes  6.74 Gbits/sec    0   3.07 MBytes       
[  5]   8.00-9.00   sec   804 MBytes  6.74 Gbits/sec    0   3.07 MBytes       
[  5]   9.00-10.00  sec   805 MBytes  6.75 Gbits/sec    0   3.07 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  7.75 GBytes  6.66 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  7.75 GBytes  6.66 Gbits/sec                  receiver

iperf Done.
root@tegra-ubuntu:/sys/kernel/config/pci_ep# iperf3 -c 10.10.10.1
Connecting to host 10.10.10.1, port 5201
[  5] local 10.10.10.2 port 44512 connected to 10.10.10.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   783 MBytes  6.57 Gbits/sec    0   3.14 MBytes       
[  5]   1.00-2.00   sec   801 MBytes  6.72 Gbits/sec    0   3.14 MBytes       
[  5]   2.00-3.00   sec   804 MBytes  6.74 Gbits/sec    0   3.14 MBytes       
[  5]   3.00-4.00   sec   796 MBytes  6.67 Gbits/sec    0   3.14 MBytes       
[  5]   4.00-5.00   sec   798 MBytes  6.70 Gbits/sec    0   3.14 MBytes       
[  5]   5.00-6.00   sec   801 MBytes  6.71 Gbits/sec    0   3.14 MBytes       
[  5]   6.00-7.00   sec   799 MBytes  6.71 Gbits/sec    0   3.14 MBytes       
[  5]   7.00-8.00   sec   800 MBytes  6.71 Gbits/sec    0   3.14 MBytes       
[  5]   8.00-9.00   sec   799 MBytes  6.70 Gbits/sec    0   3.14 MBytes       
[  5]   9.00-10.00  sec   799 MBytes  6.70 Gbits/sec    0   3.14 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  7.79 GBytes  6.69 Gbits/sec    0             sender
[  5]   0.00-10.01  sec  7.79 GBytes  6.69 Gbits/sec                  receiver

iperf Done.

Performance Test RC side:

root@tegra-ubuntu:/home/denso# iperf3 -c 10.10.10.2
Connecting to host 10.10.10.2, port 5201
[  5] local 10.10.10.1 port 38202 connected to 10.10.10.2 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   479 MBytes  4.01 Gbits/sec    0   1.23 MBytes       
[  5]   1.00-2.00   sec   490 MBytes  4.12 Gbits/sec    0   1.29 MBytes       
[  5]   2.00-3.00   sec   502 MBytes  4.21 Gbits/sec    0   1.29 MBytes       
[  5]   3.00-4.00   sec   524 MBytes  4.39 Gbits/sec    0   2.03 MBytes       
[  5]   4.00-5.00   sec   541 MBytes  4.54 Gbits/sec    0   2.03 MBytes       
[  5]   5.00-6.00   sec   548 MBytes  4.59 Gbits/sec    0   2.03 MBytes       
[  5]   6.00-7.00   sec   550 MBytes  4.61 Gbits/sec    0   2.03 MBytes       
[  5]   7.00-8.00   sec   548 MBytes  4.60 Gbits/sec    0   2.03 MBytes       
[  5]   8.00-9.00   sec   548 MBytes  4.59 Gbits/sec    0   2.03 MBytes       
[  5]   9.00-10.00  sec   548 MBytes  4.59 Gbits/sec    0   2.03 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  5.15 GBytes  4.43 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  5.15 GBytes  4.43 Gbits/sec                  receiver

iperf Done.
root@tegra-ubuntu:/home/denso# iperf3 -c 10.10.10.2
Connecting to host 10.10.10.2, port 5201
[  5] local 10.10.10.1 port 38212 connected to 10.10.10.2 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   272 MBytes  2.28 Gbits/sec    0   1.35 MBytes       
[  5]   1.00-2.00   sec   275 MBytes  2.30 Gbits/sec    0   1.35 MBytes       
[  5]   2.00-3.00   sec   519 MBytes  4.36 Gbits/sec    0   1.35 MBytes       
[  5]   3.00-4.00   sec   542 MBytes  4.56 Gbits/sec    0   1.35 MBytes       
[  5]   4.00-5.00   sec   545 MBytes  4.57 Gbits/sec    0   1.35 MBytes       
[  5]   5.00-6.00   sec   542 MBytes  4.56 Gbits/sec    0   1.35 MBytes       
[  5]   6.00-7.00   sec   544 MBytes  4.56 Gbits/sec    0   1.35 MBytes       
[  5]   7.00-8.00   sec   542 MBytes  4.55 Gbits/sec    0   1.35 MBytes       
[  5]   8.00-9.00   sec   544 MBytes  4.56 Gbits/sec    0   1.35 MBytes       
[  5]   9.00-10.00  sec   546 MBytes  4.57 Gbits/sec    0   1.48 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  4.76 GBytes  4.09 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  4.76 GBytes  4.09 Gbits/sec                  receiver

iperf Done.
root@tegra-ubuntu:/home/denso# iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 10.10.10.2, port 44476
[  5] local 10.10.10.1 port 5201 connected to 10.10.10.2 port 44478
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   756 MBytes  6.34 Gbits/sec                  
[  5]   1.00-2.00   sec   785 MBytes  6.59 Gbits/sec                  
[  5]   2.00-3.00   sec   792 MBytes  6.64 Gbits/sec                  
[  5]   3.00-4.00   sec   794 MBytes  6.66 Gbits/sec                  
[  5]   4.00-5.00   sec   793 MBytes  6.66 Gbits/sec                  
[  5]   5.00-6.00   sec   798 MBytes  6.69 Gbits/sec                  
[  5]   6.00-7.00   sec   803 MBytes  6.74 Gbits/sec                  
[  5]   7.00-8.00   sec   804 MBytes  6.74 Gbits/sec                  
[  5]   8.00-9.00   sec   804 MBytes  6.74 Gbits/sec                  
[  5]   9.00-10.00  sec   805 MBytes  6.76 Gbits/sec                  
[  5]  10.00-10.00  sec  3.90 MBytes  8.10 Gbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec  7.75 GBytes  6.66 Gbits/sec                  receiver
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 10.10.10.2, port 44510
[  5] local 10.10.10.1 port 5201 connected to 10.10.10.2 port 44512
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   779 MBytes  6.53 Gbits/sec                  
[  5]   1.00-2.00   sec   801 MBytes  6.72 Gbits/sec                  
[  5]   2.00-3.00   sec   804 MBytes  6.74 Gbits/sec                  
[  5]   3.00-4.00   sec   796 MBytes  6.67 Gbits/sec                  
[  5]   4.00-5.00   sec   799 MBytes  6.70 Gbits/sec                  
[  5]   5.00-6.00   sec   800 MBytes  6.71 Gbits/sec                  
[  5]   6.00-7.00   sec   799 MBytes  6.71 Gbits/sec                  
[  5]   7.00-8.00   sec   799 MBytes  6.71 Gbits/sec                  
[  5]   8.00-9.00   sec   799 MBytes  6.71 Gbits/sec                  
[  5]   9.00-10.00  sec   798 MBytes  6.70 Gbits/sec                  
[  5]  10.00-10.01  sec  4.53 MBytes  7.46 Gbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.01  sec  7.79 GBytes  6.69 Gbits/sec                  receiver
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
^Ciperf3: interrupt - the server has terminated
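
To put a number on the asymmetry, the receiver summary lines from the four runs above can be averaged per direction. The sketch below is a minimal Python helper, not part of iperf3; the function name `receiver_gbps` and the two log constants are made up for illustration, and the regular expression only handles summaries reported in Gbits/sec.

```python
import re
from statistics import mean

# Matches iperf3 receiver summary lines such as:
# [  5]   0.00-10.00  sec  5.15 GBytes  4.43 Gbits/sec                  receiver
SUMMARY_RE = re.compile(r"([\d.]+)\s+Gbits/sec\s+receiver\s*$")


def receiver_gbps(log):
    """Return the receiver-side bitrates (Gbits/sec) found in an iperf3 log."""
    rates = []
    for line in log.splitlines():
        match = SUMMARY_RE.search(line)
        if match:
            rates.append(float(match.group(1)))
    return rates


# Receiver summaries copied from the logs in this post.
RC_TO_EP_LOG = """\
[  5]   0.00-10.00  sec  5.15 GBytes  4.43 Gbits/sec                  receiver
[  5]   0.00-10.00  sec  4.76 GBytes  4.09 Gbits/sec                  receiver
"""
EP_TO_RC_LOG = """\
[  5]   0.00-10.00  sec  7.75 GBytes  6.66 Gbits/sec                  receiver
[  5]   0.00-10.01  sec  7.79 GBytes  6.69 Gbits/sec                  receiver
"""

if __name__ == "__main__":
    rc_to_ep = mean(receiver_gbps(RC_TO_EP_LOG))
    ep_to_rc = mean(receiver_gbps(EP_TO_RC_LOG))
    print(f"RC -> EP: {rc_to_ep:.2f} Gbits/sec")
    print(f"EP -> RC: {ep_to_rc:.2f} Gbits/sec")
    print(f"ratio:    {rc_to_ep / ep_to_rc:.2f}")
```

On the figures above this gives roughly 4.26 Gbits/sec for RC to EP against about 6.68 Gbits/sec for EP to RC, so the RC-to-EP direction reaches a little under two thirds of the reverse rate.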

PCIe vnet driver patches:

diff --git a/drivers/net/ethernet/nvidia/pcie/tegra_vnet.c b/drivers/net/ethernet/nvidia/pcie/tegra_vnet.c
index d4512ce9a..eb5d7c861 100644
--- a/drivers/net/ethernet/nvidia/pcie/tegra_vnet.c
+++ b/drivers/net/ethernet/nvidia/pcie/tegra_vnet.c
@@ -803,6 +803,7 @@ static int tvnet_host_probe(struct pci_dev *pdev,
        netif_napi_add(ndev, &tvnet->napi, tvnet_host_poll, TVNET_NAPI_WEIGHT);
 
        ndev->mtu = TVNET_DEFAULT_MTU;
+       ndev->max_mtu = TVNET_MAX_MTU;
 
        ret = register_netdev(ndev);
        if (ret) {
diff --git a/drivers/pci/endpoint/functions/pci-epf-tegra-vnet.c b/drivers/pci/endpoint/functions/pci-epf-tegra-vnet.c
index 834081439..5c5890540 100644
--- a/drivers/pci/endpoint/functions/pci-epf-tegra-vnet.c
+++ b/drivers/pci/endpoint/functions/pci-epf-tegra-vnet.c
@@ -1518,6 +1518,7 @@ static int tvnet_ep_pci_epf_bind(struct pci_epf *epf)
        netif_napi_add(ndev, &tvnet->napi, tvnet_ep_poll, TVNET_NAPI_WEIGHT);
 
        ndev->mtu = TVNET_DEFAULT_MTU;
+       ndev->max_mtu = TVNET_MAX_MTU;
 
        ret = register_netdev(ndev);
        if (ret < 0) {

Are you sure this issue is related to the max MTU setting? I really doubt it…

Hi @WayneWWW

I’m not sure whether this fix applies to other setups, but it works on my side.
Before applying the change, I ran some tests on the original JP_5.1:

  1. The default MTU of VNET is 64512.
  2. The actual MTU range of VNET is 68 to 1500 (confirmed with ifconfig).
  3. When I manually change the MTU to 1500 with ifconfig on both sides (RC and EP), it works (ping and iperf).

Based on these points, I reviewed the VNET drivers and found that the effective MTU range differs from the one defined in the drivers. For this reason, I added these changes to the VNET initialization, and it currently works on both Xavier and Orin platforms.

Even so, I still can’t figure out why JP_4.6.3 works without this change.
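
The 68 to 1500 range matches the defaults that the kernel's generic Ethernet setup (`ether_setup()`) assigns to `min_mtu` and `max_mtu`, and MTU changes are checked against those two fields. The sketch below is a user-space Python model of that range check, not the driver code. The `NetDev` class and `validate_mtu` function are illustrative names, and `ASSUMED_TVNET_MAX_MTU` is a placeholder because the real value of `TVNET_MAX_MTU` is not shown in this thread. It only illustrates why a 64512-byte MTU falls outside the allowed range until `max_mtu` is raised; it does not by itself explain the missing ARP replies.

```python
import errno
from dataclasses import dataclass

ETH_MIN_MTU = 68      # kernel default lower bound for Ethernet devices
ETH_DATA_LEN = 1500   # kernel default upper bound for Ethernet devices


@dataclass
class NetDev:
    """Tiny stand-in for the MTU-related fields of struct net_device."""
    mtu: int
    min_mtu: int = ETH_MIN_MTU
    max_mtu: int = ETH_DATA_LEN


def validate_mtu(dev, new_mtu):
    """Return 0 if new_mtu is within the device's range, else -EINVAL."""
    if new_mtu < 0 or new_mtu < dev.min_mtu:
        return -errno.EINVAL
    # A max_mtu of 0 means "no upper limit" in this model of the check.
    if dev.max_mtu > 0 and new_mtu > dev.max_mtu:
        return -errno.EINVAL
    return 0


TVNET_DEFAULT_MTU = 64512        # default MTU reported in this thread
ASSUMED_TVNET_MAX_MTU = 64512    # assumption: real TVNET_MAX_MTU is not shown here

# Unpatched: mtu is set directly, but max_mtu keeps the Ethernet default.
unpatched = NetDev(mtu=TVNET_DEFAULT_MTU)
# Patched: max_mtu is raised as in the diff above.
patched = NetDev(mtu=TVNET_DEFAULT_MTU, max_mtu=ASSUMED_TVNET_MAX_MTU)

if __name__ == "__main__":
    for name, dev in (("unpatched", unpatched), ("patched", patched)):
        for mtu in (1500, TVNET_DEFAULT_MTU):
            verdict = "ok" if validate_mtu(dev, mtu) == 0 else "rejected"
            print(f"{name}: mtu {mtu} -> {verdict}")
```

In this model the unpatched device accepts 1500 but rejects 64512, which is consistent with the manual workaround of setting the MTU to 1500 on both sides.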

Screenshot from 2023-06-20 09-34-56

Hi @WayneWWW

By the way, do you have any ideas about the poor VNET performance between the RC and EP sides (especially from RC to EP)?


Is this based on a Xavier-to-Xavier connection or an Orin-to-Xavier connection?

Also, when you first changed the MTU to 1500 and got it working, had you already applied the max MTU patch, or was it not applied yet?

And did you disable anything else, such as NetworkManager, or enable any network settings at that time?

Hi @WayneWWW

The test is Xavier to Xavier, and the max MTU patch had not been applied yet.
The network settings follow what you mentioned before:

  1. disable NetworkManager
  2. enable netplan

BTW, as in my earlier test, the Orin-to-Xavier connection worked originally:

It looks like a regression on rel-35.3.1 and we are checking it.

Hi @WayneWWW

Does this include the performance issue for transmission from RC to EP?

No, performance may not be related.

Hi @WayneWWW

Is there any update on the transmission performance issue of the PCIe EP and RC?

Hi,

Sorry, there is no update yet.
Also, this issue has nothing to do with performance. Performance can only be measured once the link is actually running, and the current problem is that it does not run.
For now we will focus only on the not-working case; performance will be discussed after that is fixed.

Hi,

This should work on JP 5.1.2.

Hi @WayneWWW

I have tried JP 5.1.2, but it still does not work.

PCIe RC : Jetson AGX Xavier Devkit
[Snapshot]

[Kernel Message]
xavier_jp512_pcie_rc_dmesg.txt (66.8 KB)

PCIe EP : Jetson AGX Orin Devkit
[Snapshot]

[Kernel Message]
orin_jp512_pcie_ep_dmesg.txt (70.3 KB)