The Ethernet interface over PCIe (endpoint mode) does not work

Are you saying that you can use this section to write the shared RAM on the EP?

https://docs.nvidia.com/jetson/archives/r35.2.1/DeveloperGuide/text/SD/Communications/PcieEndpointMode.html#accessing-the-shared-ram-on-the-endpoint-device

By the way, could you upgrade to 35.3.1? We don’t debug directly on old releases.

Yes, I can write and read the shared memory on the EP side.

OK, I’ll try upgrading to 35.3.1 and post the result after testing.

Hi,

I have tried version 35.3.1; the test result is the same as with version 35.2.1.

  1. Reading and writing the shared memory on the EP side works.
  2. The virtual network can be established, but the two sides cannot communicate (ping fails).

Hello,

I tried JetPack 4.6.3 + Jetson Linux 32.7.3 (kernel 4.9.299): the virtual network can be established and communication succeeds. If JetPack 5.x (Jetson Linux 35.2.1/35.3.1) is a must, do you have any idea how to solve this issue?

By the way, the performance does not seem good: the average bandwidth is about 2.x Gbits/s (over PCIe x8).

EP side:

RC side:

thanks,

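At the moment we still need to try to reproduce this and see whether we hit your issue on JP5. The whole process may be slow.

If you are in a hurry, please use JP 4.6.3 for now.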

Hi hermes_wu,

Please try the commands below to disable NetworkManager:

$ sudo systemctl stop NetworkManager.service
$ sudo systemctl disable NetworkManager.service

Apply the patch on the EP and set a static IP on both sides.
For detailed steps, refer to: Jetpack5.0.2 Xavier pcie endpoint mode [repetition] - #9 by WayneWWW

Hi @carolyuu,

I disabled NetworkManager on both the EP and the RC, but they still can’t ping each other.
Jetpack : 5.1
Jetson Linux : 35.2.1

EP Log:

root@tegra-ubuntu:/home/denso# systemctl status NetworkManager.service
● NetworkManager.service - Network Manager
     Loaded: loaded (/lib/systemd/system/NetworkManager.service; disabled; vend>
     Active: inactive (dead)
       Docs: man:NetworkManager(8)
root@tegra-ubuntu:/home/denso# cd /sys/kernel/config/pci_ep/
root@tegra-ubuntu:/sys/kernel/config/pci_ep# mkdir functions/pci_epf_tvnet/func1
root@tegra-ubuntu:/sys/kernel/config/pci_ep# echo 16 > functions/pci_epf_tvnet/func1/msi_interrupts
root@tegra-ubuntu:/sys/kernel/config/pci_ep# ln -s functions/pci_epf_tvnet/func1 controllers/141a0000.pcie_ep/
root@tegra-ubuntu:/sys/kernel/config/pci_ep# echo 1 > controllers/141a0000.pcie_ep/start
root@tegra-ubuntu:/sys/kernel/config/pci_ep# [   42.849523] pci_epf_tvnet pci_epf_tvnet.0: tvnet_ep_open: PCIe link is not up

root@tegra-ubuntu:/sys/kernel/config/pci_ep# [   51.278141] tegra194-pcie 141a0000.pcie_ep: LTSSM state: 0x2018 timeout: -110

root@tegra-ubuntu:/sys/kernel/config/pci_ep# ifconfig eth1 up
root@tegra-ubuntu:/sys/kernel/config/pci_ep# [  127.283921] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[  127.284143] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[  127.285247] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[  127.285431] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[  127.289252] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[  127.289525] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[  127.536388] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[  127.536599] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[  127.786675] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[  127.786905] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!

root@tegra-ubuntu:/sys/kernel/config/pci_ep# ifconfig eth1
eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 64512
        inet 10.10.10.2  netmask 255.255.255.0  broadcast 10.10.10.255
        inet6 fe80::a8d5:caff:feb3:b36d  prefixlen 64  scopeid 0x20<link>
        ether aa:d5:ca:b3:b3:6d  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

root@tegra-ubuntu:/sys/kernel/config/pci_ep# ping 10.10.10.3
PING 10.10.10.3 (10.10.10.3) 56(84) bytes of data.

^C
--- 10.10.10.3 ping statistics ---
33 packets transmitted, 0 received, 100% packet loss, time 32762ms

root@tegra-ubuntu:/sys/kernel/config/pci_ep# 
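
The endpoint bring-up commands in the EP log above can be collected into one small function, which makes the sequence easier to repeat after each reboot. This is only a sketch: the function name `start_tvnet_ep` and the `CFG`/`CTRL` variables are hypothetical, and the default paths are simply the ones that appear in the log.

```shell
#!/bin/sh
# Sketch of the endpoint bring-up sequence seen in the EP log.
# CFG and CTRL are hypothetical helper variables; the defaults are the
# configfs root and controller name taken from the log.
CFG="${CFG:-/sys/kernel/config/pci_ep}"
CTRL="${CTRL:-141a0000.pcie_ep}"

# Runs in a subshell so the "cd" does not leak into the caller.
start_tvnet_ep() (
    cd "$CFG" || exit 1
    if [ ! -d "controllers/$CTRL" ]; then
        echo "controller $CTRL not found under $CFG" >&2
        exit 1
    fi
    # Create the virtual network function and set its MSI count.
    mkdir -p functions/pci_epf_tvnet/func1 || exit 1
    echo 16 > functions/pci_epf_tvnet/func1/msi_interrupts || exit 1
    # Bind the function to the endpoint controller, then start it.
    ln -s functions/pci_epf_tvnet/func1 "controllers/$CTRL/" || exit 1
    echo 1 > "controllers/$CTRL/start"
)

# Run as root on the EP with:  sh this_script.sh run
if [ "${1:-}" = "run" ]; then
    start_tvnet_ep
fi
```

Each step stops the sequence on failure, so a missing controller or an already bound function shows up immediately instead of as a later link error.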

RC Log:

root@tegra-ubuntu:/home/denso# systemctl status NetworkManager.service
● NetworkManager.service - Network Manager
     Loaded: loaded (/lib/systemd/system/NetworkManager.service; disabled; vend>
     Active: inactive (dead)
       Docs: man:NetworkManager(8)
root@tegra-ubuntu:/home/denso# ifconfig eth0 up
root@tegra-ubuntu:/home/denso# dmesg | grep eth
[    0.000000] psci: probing for conduit method from DT.
[    2.863424] usbcore: registered new interface driver cdc_ether
[    4.453991] optee: probing for conduit method.
[    8.553929] nvethernet 2490000.ethernet: Adding to iommu group 30
[    8.563015] nvethernet 2490000.ethernet: failed to read skip mac reset flag, default 0
[    8.578455] nvethernet 2490000.ethernet: failed to read MDIO address
[    8.590126] nvethernet 2490000.ethernet: setting to default DMA bit mask
[    8.604906] nvethernet 2490000.ethernet: set default TXQ to TC mapping
[    8.623601] nvethernet 2490000.ethernet: Setting default PTP RX queue
[    8.629956] nvethernet 2490000.ethernet: Failed to read DMA Tx ring size, using default [1024]
[    8.638884] nvethernet 2490000.ethernet: Failed to read DMA Tx ring size, using default [1024]
[    8.652044] nvethernet 2490000.ethernet: failed to get eqos_rx_m clk
[    8.653571] nvethernet 2490000.ethernet: failed to get eqos_rx_input clk
[    8.664522] nvethernet 2490000.ethernet: failed to get eqos_tx_divider clk
[    8.665631] nvethernet 2490000.ethernet: Ethernet MAC address: 48:b0:2d:3c:83:66
[    8.696109] nvethernet 2490000.ethernet: Macsec not supported
[    8.703267] nvethernet 2490000.ethernet: eth1 (HW ver: 50) created with 1 DMA channels
[   12.250407] using random self ethernet address
[   12.250554] using random host ethernet address
[   13.670201] using random self ethernet address
[   13.670329] using random host ethernet address
[   53.603889] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
root@tegra-ubuntu:/home/denso# ifconfig eth0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 64512
        inet 10.10.10.3  netmask 255.255.255.0  broadcast 10.10.10.255
        inet6 fe80::78ba:9dff:fe8d:6634  prefixlen 64  scopeid 0x20<link>
        ether 7a:ba:9d:8d:66:34  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

root@tegra-ubuntu:/home/denso# ping 10.10.10.2
PING 10.10.10.2 (10.10.10.2) 56(84) bytes of data.
From 10.10.10.3 icmp_seq=1 Destination Host Unreachable
From 10.10.10.3 icmp_seq=2 Destination Host Unreachable
From 10.10.10.3 icmp_seq=3 Destination Host Unreachable
From 10.10.10.3 icmp_seq=4 Destination Host Unreachable
From 10.10.10.3 icmp_seq=5 Destination Host Unreachable
From 10.10.10.3 icmp_seq=6 Destination Host Unreachable
From 10.10.10.3 icmp_seq=7 Destination Host Unreachable
From 10.10.10.3 icmp_seq=8 Destination Host Unreachable
From 10.10.10.3 icmp_seq=9 Destination Host Unreachable
From 10.10.10.3 icmp_seq=10 Destination Host Unreachable
From 10.10.10.3 icmp_seq=11 Destination Host Unreachable
From 10.10.10.3 icmp_seq=12 Destination Host Unreachable
From 10.10.10.3 icmp_seq=13 Destination Host Unreachable
^C
--- 10.10.10.2 ping statistics ---
17 packets transmitted, 0 received, +13 errors, 100% packet loss, time 16385ms
pipe 4
root@tegra-ubuntu:/home/denso# 

Disabling NM alone is not enough; you also need to set the static IP through netplan, as described in that topic.
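
For reference, a static address set through netplan might look roughly like what the sketch below prints. The helper name `gen_netplan`, the `networkd` renderer and the file name in the last comment are assumptions, not taken from the referenced topic; the interface names and addresses are the ones from the logs in this thread.

```shell
#!/bin/sh
# Sketch: print a minimal netplan file that gives one interface a static
# address. Nothing is written under /etc by this function.
gen_netplan() {
    iface="$1"   # e.g. eth1 on the EP, eth0 on the RC (names can differ per boot)
    addr="$2"    # address in CIDR notation, e.g. 10.10.10.2/24
    cat <<EOF
network:
  version: 2
  renderer: networkd
  ethernets:
    ${iface}:
      dhcp4: false
      addresses:
        - ${addr}
EOF
}

# EP side example, printed to stdout:
gen_netplan eth1 10.10.10.2/24

# To use it, save the output under /etc/netplan/ and apply it
# (the file name is hypothetical):
#   gen_netplan eth1 10.10.10.2/24 | sudo tee /etc/netplan/99-pcie-vnet.yaml
#   sudo netplan apply
```

The RC side would use the same layout with its own interface name and address, for example `gen_netplan eth0 10.10.10.3/24`.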

Hi @WayneWWW ,

Yes, I have already set the static IP through Netplan.

Hello @WayneWWW ,

Do you have any other suggestions for how I can verify this issue?

Hi @WayneWWW ,

I conducted a comparison test between JetPack 4.6.3 and 5.1, and the results are shown in the table below.

According to this table, the issue seems to reside on the Root complex side of JetPack v5.1.

Hi @hermes_wu

Yes, I noticed this issue. Will check what is going on.

BTW, could you help me check the functionality of jp5.0.2 and jp5.0.2? I remember this was working on my side in previous setup (NM disabled/netplan setup).

Hi @WayneWWW

I have checked the functionality of Jetpack 5.0.2 and the result is the same as Jetpack 5.1.

Hi @WayneWWW

I used Wireshark to capture Ethernet packets on the EP side and found that the Ethernet driver receives the packets sent from the RC side. However, there seems to be no response at the network layer (no reply to ARP). Do you have any idea about this situation? The related information is attached below.

EP : Jetpack 5.1
RC : Jetpack 5.1

EP side test:

root@tegra-ubuntu:/sys/kernel/config/pci_ep# ifconfig eth2
eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 64512
        inet 10.10.10.2  netmask 255.255.255.0  broadcast 10.10.10.255
        inet6 fe80::860:6ff:fe52:ec10  prefixlen 64  scopeid 0x20<link>
        ether 0a:60:06:52:ec:10  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

root@tegra-ubuntu:/sys/kernel/config/pci_ep# ping 10.10.10.1
PING 10.10.10.1 (10.10.10.1) 56(84) bytes of data.
From 10.10.10.2 icmp_seq=1 Destination Host Unreachable
From 10.10.10.2 icmp_seq=2 Destination Host Unreachable
From 10.10.10.2 icmp_seq=5 Destination Host Unreachable
From 10.10.10.2 icmp_seq=6 Destination Host Unreachable
^C
--- 10.10.10.1 ping statistics ---
7 packets transmitted, 0 received, +4 errors, 100% packet loss, time 6145ms
pipe 4
root@tegra-ubuntu:/sys/kernel/config/pci_ep# 

RC side test:

root@tegra-ubuntu:/home/denso# ifconfig eth0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 64512
        inet 10.10.10.1  netmask 255.255.255.0  broadcast 10.10.10.255
        inet6 fe80::e060:68ff:fea3:3a83  prefixlen 64  scopeid 0x20<link>
        ether e2:60:68:a3:3a:83  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

root@tegra-ubuntu:/home/denso# ping 10.10.10.2
PING 10.10.10.2 (10.10.10.2) 56(84) bytes of data.
From 10.10.10.1 icmp_seq=1 Destination Host Unreachable
From 10.10.10.1 icmp_seq=2 Destination Host Unreachable
From 10.10.10.1 icmp_seq=3 Destination Host Unreachable
From 10.10.10.1 icmp_seq=4 Destination Host Unreachable
From 10.10.10.1 icmp_seq=5 Destination Host Unreachable
From 10.10.10.1 icmp_seq=6 Destination Host Unreachable
^C
--- 10.10.10.2 ping statistics ---
7 packets transmitted, 0 received, +6 errors, 100% packet loss, time 6144ms
pipe 4
root@tegra-ubuntu:/home/denso#
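
"Destination Host Unreachable" reported from the local address, as in the two ping runs above, usually means address resolution for the peer failed. One quick way to confirm that is to look at the neighbour table entry for the peer. The sketch below is a hypothetical helper (`neigh_state` is not from the thread) that reduces `ip neigh` output to the state of one peer.

```shell
#!/bin/sh
# Report the neighbour (ARP) state for one peer from "ip neigh" output on
# stdin. Prints the state keyword (REACHABLE, STALE, FAILED, INCOMPLETE, ...)
# or NONE when the peer has no entry at all.
neigh_state() {
    awk -v peer="$1" '
        $1 == peer { print $NF; found = 1 }
        END        { if (!found) print "NONE" }
    '
}

# On the EP, with the interface and address from the log above:
#   ip neigh show dev eth2 | neigh_state 10.10.10.1
# To watch ARP frames on the interface while the other side pings:
#   sudo tcpdump -n -e -i eth2 arp
```

A `FAILED` or `INCOMPLETE` state while ARP requests are visible in the capture would be consistent with the observation that frames arrive but are never answered.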

Wireshark snapshot:

Hi @WayneWWW

I borrowed an AGX Orin dev kit and used it to replace the AGX Xavier on the RC side. I then ran a comparison test; please refer to test set 2 in the test cases below.

Summary:

  1. Ethernet communication between the EP and the RC works in test set 2.
  2. The bitrate from RC to EP is less than 4 Gbits/s.
  3. The bitrate from EP to RC is less than 7 Gbits/s.

Question:

  1. Why is the result different between AGX Xavier and AGX Orin with the same JP 5.1?
  2. The throughput seems far below what PCIe Gen3 x8 can deliver (less than 10%), even though I set the power mode to MAXN.
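
For a rough sense of the gap, the nominal link rate can be worked out from the figures in this post. The sketch below assumes Gen3 signalling at 8 GT/s per lane with 128b/130b encoding and ignores packet and protocol overhead; the function names are made up for the example. The result is in line with the 63.008 Gb/s that the kernel reports in the RC dmesg log below.

```shell
#!/bin/sh
# Nominal PCIe data rate in Gbit/s: GT/s per lane x lanes x encoding
# efficiency. 128/130 is the Gen3 line encoding; TLP overhead is not modelled.
pcie_gbps() {
    awk -v gts="$1" -v lanes="$2" 'BEGIN { printf "%.2f\n", gts * lanes * 128 / 130 }'
}

# Measured rate as a percentage of the nominal link rate.
pct_of_link() {
    awk -v m="$1" -v l="$2" 'BEGIN { printf "%.1f\n", 100 * m / l }'
}

link=$(pcie_gbps 8 8)
echo "Gen3 x8 nominal: $link Gbit/s"
echo "RC -> EP (3.44 Gbit/s): $(pct_of_link 3.44 "$link")%"
echo "EP -> RC (6.45 Gbit/s): $(pct_of_link 6.45 "$link")%"
```

With the iperf3 averages from the logs, this puts the RC to EP direction at roughly 5% of the nominal rate and the EP to RC direction at roughly 10%.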

Test Cases:
PCIe_InterCon_Compare

EP Side test log:

root@tegra-ubuntu:/sys/kernel/config/pci_ep# ifconfig eth1
eth1: flags=4098<BROADCAST,MULTICAST>  mtu 64512
        ether fa:a6:fa:3a:19:bf  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

root@tegra-ubuntu:/sys/kernel/config/pci_ep# ifconfig eth1 up
root@tegra-ubuntu:/sys/kernel/config/pci_ep# [  333.694349] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[  333.694576] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[  333.831660] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[  333.831938] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[  333.866746] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[  333.867004] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[  334.370709] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[  334.370985] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[  334.594560] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[  334.594821] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!

root@tegra-ubuntu:/sys/kernel/config/pci_ep# ifconfig eth1
eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 64512
        inet 10.10.10.2  netmask 255.255.255.0  broadcast 10.10.10.255
        inet6 fe80::f8a6:faff:fe3a:19bf  prefixlen 64  scopeid 0x20<link>
        ether fa:a6:fa:3a:19:bf  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

root@tegra-ubuntu:/sys/kernel/config/pci_ep# ping 10.10.10.1
PING 10.10.10.1 (10.10.10.1) 56(84) bytes of data.
64 bytes from 10.10.10.1: icmp_seq=1 ttl=64 time=0.597 ms
64 bytes from 10.10.10.1: icmp_seq=2 ttl=64 time=1.27 ms
64 bytes from 10.10.10.1: icmp_seq=3 ttl=64 time=1.27 ms
64 bytes from 10.10.10.1: icmp_seq=4 ttl=64 time=1.21 ms
^C
--- 10.10.10.1 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3005ms
rtt min/avg/max/mdev = 0.597/1.085/1.272/0.283 ms
root@tegra-ubuntu:/sys/kernel/config/pci_ep# iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 10.10.10.1, port 47622
[  5] local 10.10.10.2 port 5201 connected to 10.10.10.1 port 47624
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   418 MBytes  3.51 Gbits/sec                  
[  5]   1.00-2.00   sec   414 MBytes  3.47 Gbits/sec                  
[  5]   2.00-3.00   sec   412 MBytes  3.45 Gbits/sec                  
[  5]   3.00-4.00   sec   409 MBytes  3.43 Gbits/sec                  
[  5]   4.00-5.00   sec   408 MBytes  3.43 Gbits/sec                  
[  5]   5.00-6.00   sec   408 MBytes  3.42 Gbits/sec                  
[  5]   6.00-7.00   sec   411 MBytes  3.45 Gbits/sec                  
[  5]   7.00-8.00   sec   407 MBytes  3.41 Gbits/sec                  
[  5]   8.00-9.00   sec   406 MBytes  3.41 Gbits/sec                  
[  5]   9.00-10.00  sec   406 MBytes  3.41 Gbits/sec                  
[  5]  10.00-10.00  sec   831 KBytes  3.30 Gbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec  4.00 GBytes  3.44 Gbits/sec                  receiver
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
^Ciperf3: interrupt - the server has terminated
root@tegra-ubuntu:/sys/kernel/config/pci_ep# iperf3 -c 10.10.10.1
Connecting to host 10.10.10.1, port 5201
[  5] local 10.10.10.2 port 60002 connected to 10.10.10.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   782 MBytes  6.56 Gbits/sec    0   3.07 MBytes       
[  5]   1.00-2.00   sec   742 MBytes  6.23 Gbits/sec    0   3.07 MBytes       
[  5]   2.00-3.00   sec   739 MBytes  6.19 Gbits/sec    0   3.07 MBytes       
[  5]   3.00-4.00   sec   750 MBytes  6.30 Gbits/sec    0   3.07 MBytes       
[  5]   4.00-5.00   sec   756 MBytes  6.34 Gbits/sec    0   3.07 MBytes       
[  5]   5.00-6.00   sec   780 MBytes  6.54 Gbits/sec    0   3.07 MBytes       
[  5]   6.00-7.00   sec   791 MBytes  6.64 Gbits/sec    0   3.07 MBytes       
[  5]   7.00-8.00   sec   798 MBytes  6.69 Gbits/sec    0   3.07 MBytes       
[  5]   8.00-9.00   sec   784 MBytes  6.58 Gbits/sec    0   3.07 MBytes       
[  5]   9.00-10.00  sec   772 MBytes  6.48 Gbits/sec    0   3.07 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  7.51 GBytes  6.45 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  7.51 GBytes  6.45 Gbits/sec                  receiver

iperf Done.
root@tegra-ubuntu:/sys/kernel/config/pci_ep# 

RC Side test log:

root@tegra-ubuntu:/home/denso# ifconfig eth0
eth0: flags=4098<BROADCAST,MULTICAST>  mtu 64512
        ether f6:a1:f9:f6:21:a6  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

root@tegra-ubuntu:/home/denso# ifconfig eth0 up
root@tegra-ubuntu:/home/denso# ifconfig eth0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 64512
        inet 10.10.10.1  netmask 255.255.255.0  broadcast 10.10.10.255
        inet6 fe80::f4a1:f9ff:fef6:21a6  prefixlen 64  scopeid 0x20<link>
        ether f6:a1:f9:f6:21:a6  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

root@tegra-ubuntu:/home/denso# ping 10.10.10.2
PING 10.10.10.2 (10.10.10.2) 56(84) bytes of data.
64 bytes from 10.10.10.2: icmp_seq=1 ttl=64 time=3.18 ms
64 bytes from 10.10.10.2: icmp_seq=2 ttl=64 time=1.66 ms
64 bytes from 10.10.10.2: icmp_seq=3 ttl=64 time=1.65 ms
64 bytes from 10.10.10.2: icmp_seq=4 ttl=64 time=1.60 ms
^C
--- 10.10.10.2 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3006ms
rtt min/avg/max/mdev = 1.599/2.023/3.180/0.668 ms
root@tegra-ubuntu:/home/denso# iperf3 -c 10.10.10.2
Connecting to host 10.10.10.2, port 5201
[  5] local 10.10.10.1 port 47624 connected to 10.10.10.2 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   418 MBytes  3.51 Gbits/sec    0   1.35 MBytes       
[  5]   1.00-2.00   sec   414 MBytes  3.47 Gbits/sec    0   1.35 MBytes       
[  5]   2.00-3.00   sec   412 MBytes  3.45 Gbits/sec    0   1.35 MBytes       
[  5]   3.00-4.00   sec   408 MBytes  3.43 Gbits/sec    0   1.35 MBytes       
[  5]   4.00-5.00   sec   409 MBytes  3.43 Gbits/sec    0   1.35 MBytes       
[  5]   5.00-6.00   sec   409 MBytes  3.42 Gbits/sec    0   2.09 MBytes       
[  5]   6.00-7.00   sec   410 MBytes  3.45 Gbits/sec    0   2.09 MBytes       
[  5]   7.00-8.00   sec   408 MBytes  3.41 Gbits/sec    0   2.09 MBytes       
[  5]   8.00-9.00   sec   406 MBytes  3.41 Gbits/sec    0   2.09 MBytes       
[  5]   9.00-10.00  sec   406 MBytes  3.41 Gbits/sec    0   2.09 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  4.00 GBytes  3.44 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  4.00 GBytes  3.44 Gbits/sec                  receiver

iperf Done.
root@tegra-ubuntu:/home/denso# iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 10.10.10.2, port 60000
[  5] local 10.10.10.1 port 5201 connected to 10.10.10.2 port 60002
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   774 MBytes  6.49 Gbits/sec                  
[  5]   1.00-2.00   sec   749 MBytes  6.29 Gbits/sec                  
[  5]   2.00-3.00   sec   738 MBytes  6.19 Gbits/sec                  
[  5]   3.00-4.00   sec   750 MBytes  6.29 Gbits/sec                  
[  5]   4.00-5.00   sec   756 MBytes  6.34 Gbits/sec                  
[  5]   5.00-6.00   sec   780 MBytes  6.54 Gbits/sec                  
[  5]   6.00-7.00   sec   791 MBytes  6.64 Gbits/sec                  
[  5]   7.00-8.00   sec   798 MBytes  6.69 Gbits/sec                  
[  5]   8.00-9.00   sec   785 MBytes  6.58 Gbits/sec                  
[  5]   9.00-10.00  sec   772 MBytes  6.48 Gbits/sec                  
[  5]  10.00-10.00  sec  1.50 MBytes  6.08 Gbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec  7.51 GBytes  6.45 Gbits/sec                  receiver
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
^Ciperf3: interrupt - the server has terminated

root@tegra-ubuntu:/home/denso# dmesg | grep -i pcie
[    7.591710] tegra194-pcie 14100000.pcie: Adding to iommu group 9
[    7.604338] tegra194-pcie 14100000.pcie: Using GICv2m MSI allocator
[    7.611451] tegra194-pcie 14160000.pcie: Adding to iommu group 10
[    7.623846] tegra194-pcie 14160000.pcie: Using GICv2m MSI allocator
[    7.630651] tegra194-pcie 141a0000.pcie: Adding to iommu group 11
[    7.642535] tegra194-pcie 141a0000.pcie: Using GICv2m MSI allocator
[    7.649500] tegra194-pcie 141a0000.pcie: Failed to get slot regulators: -517
[    9.929852] tegra194-pcie 14100000.pcie: Using GICv2m MSI allocator
[    9.944360] tegra194-pcie 14100000.pcie: host bridge /pcie@14100000 ranges:
[    9.951534] tegra194-pcie 14100000.pcie:       IO 0x0030100000..0x00301fffff -> 0x0030100000
[    9.963943] tegra194-pcie 14100000.pcie:      MEM 0x20a8000000..0x20afffffff -> 0x0040000000
[    9.980207] tegra194-pcie 14100000.pcie:      MEM 0x2080000000..0x20a7ffffff -> 0x2080000000
[   10.099975] tegra194-pcie 14100000.pcie: Link up
[   10.106675] tegra194-pcie 14100000.pcie: PCI host bridge to bus 0001:00
[   10.318073] pcieport 0001:00:00.0: Adding to iommu group 9
[   10.330202] pcieport 0001:00:00.0: PME: Signaling with IRQ 51
[   10.336442] pcieport 0001:00:00.0: AER: enabled with IRQ 51
[   10.343009] tegra194-pcie 14160000.pcie: Using GICv2m MSI allocator
[   10.350672] tegra194-pcie 14160000.pcie: host bridge /pcie@14160000 ranges:
[   10.364574] tegra194-pcie 14160000.pcie:       IO 0x0036100000..0x00361fffff -> 0x0036100000
[   10.373269] tegra194-pcie 14160000.pcie:      MEM 0x2428000000..0x242fffffff -> 0x0040000000
[   10.381970] tegra194-pcie 14160000.pcie:      MEM 0x2140000000..0x2427ffffff -> 0x2140000000
[   11.496264] tegra194-pcie 14160000.pcie: Phy link never came up
[   11.502439] tegra194-pcie 14160000.pcie: PCI host bridge to bus 0004:00
[   11.578175] pcieport 0004:00:00.0: Adding to iommu group 10
[   11.584176] pcieport 0004:00:00.0: PME: Signaling with IRQ 53
[   11.590456] pcieport 0004:00:00.0: AER: enabled with IRQ 53
[   11.615584] tegra194-pcie 141a0000.pcie: Using GICv2m MSI allocator
[   11.622152] tegra194-pcie 141a0000.pcie: Failed to get slot regulators: -517
[   11.622247] vdd-3v3-pcie: supplied by vdd-3v3-sys
[   11.681432] tegra194-pcie 141a0000.pcie: Using GICv2m MSI allocator
[   11.797188] tegra194-pcie 141a0000.pcie: host bridge /pcie@141a0000 ranges:
[   11.804375] tegra194-pcie 141a0000.pcie:       IO 0x003a100000..0x003a1fffff -> 0x003a100000
[   11.813055] tegra194-pcie 141a0000.pcie:      MEM 0x2b28000000..0x2b2fffffff -> 0x0040000000
[   11.821733] tegra194-pcie 141a0000.pcie:      MEM 0x2740000000..0x2b27ffffff -> 0x2740000000
[   11.936120] tegra194-pcie 141a0000.pcie: Link up
[   11.943017] tegra194-pcie 141a0000.pcie: PCI host bridge to bus 0005:00
[   12.036410] pci 0005:01:00.0: 63.008 Gb/s available PCIe bandwidth, limited by 8.0 GT/s PCIe x8 link at 0005:00:00.0 (capable of 126.024 Gb/s with 16.0 GT/s PCIe x8 link)
[   12.134372] pcieport 0005:00:00.0: Adding to iommu group 11
[   12.140347] pcieport 0005:00:00.0: PME: Signaling with IRQ 55
[   12.146620] pcieport 0005:00:00.0: AER: enabled with IRQ 55
root@tegra-ubuntu:/home/denso# 

For the performance issue, what is the lspci -vvv result on the RC side?

And for the PCIe Ethernet issue, could you share what the difference is between your Orin and Xavier setups?

I mean, did you apply the same patches to both or not?
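
Full `lspci -vvv` output is long, and for a throughput question the link and payload fields are usually the first things to compare. A small filter such as the one below can pull them out. It is a sketch only; `link_summary` is a hypothetical name, and the pattern list is a guess at which fields matter here.

```shell
#!/bin/sh
# Reduce "lspci -vvv" output (read from stdin) to the device header lines
# and the link/payload fields most relevant to throughput.
link_summary() {
    awk '
        /^[0-9a-fA-F]/ {
            # Device header lines start in column 0 with the PCI address.
            print
            next
        }
        /LnkCap:|LnkSta:|MaxPayload/ {
            # Replace the leading tabs with a fixed two-space indent.
            sub(/^[ \t]+/, "  ")
            print
        }
    '
}

# Typical use on the RC (root is needed to show capabilities):
#   sudo lspci -vvv | link_summary
```

Comparing `LnkCap` against `LnkSta` shows whether a link trained at a lower speed or width than it supports, and the `MaxPayload` value under `DevCtl` shows the payload size actually in use.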

Hi @WayneWWW

I applied the same patches on both Orin and Xavier. That is, I used the same image built on my side and only specified a different .conf when flashing them.

Xavier flash command:

./flash.sh jetson-agx-xavier-devkit mmcblk0p1

Orin flash command:

./flash.sh jetson-agx-orin-devkit mmcblk0p1

The lspci -vvv result on the RC side:

root@tegra-ubuntu:/home/denso# lspci -vvv
0001:00:00.0 PCI bridge: NVIDIA Corporation Device 229e (rev a1) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 51
	Bus: primary=00, secondary=01, subordinate=ff, sec-latency=0
	I/O behind bridge: 00001000-00001fff [size=4K]
	Memory behind bridge: 40000000-400fffff [size=1M]
	Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff [disabled]
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
	BridgeCtl: Parity- SERR+ NoISA- VGA- VGA16- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [40] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold-)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
		Address: 0000000000000000  Data: 0000
	Capabilities: [70] Express (v2) Root Port (Slot-), MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0
			ExtTag- RBE+
		DevCtl:	CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 128 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
		LnkCap:	Port #0, Speed 16GT/s, Width x1, ASPM not supported
			ClockPM- Surprise+ LLActRep+ BwNot+ ASPMOptComp+
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk-
			ExtSynch- ClockPM- AutWidDis- BWInt+ AutBWInt-
		LnkSta:	Speed 2.5GT/s (downgraded), Width x1 (ok)
			TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-
		RootCap: CRSVisible+
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna+ CRSVisible+
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
		DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, NROPrPrP+, LTR+
			 10BitTagComp+, 10BitTagReq-, OBFF Not Supported, ExtFmt-, EETLPPrefix-
			 EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
			 FRS-, LN System CLS Not Supported, TPHComp-, ExtTPHComp-, ARIFwd+
			 AtomicOpsCap: Routing- 32bit- 64bit- 128bitCAS-
		DevCtl2: Completion Timeout: 1ms to 10ms, TimeoutDis-, LTR+, OBFF Disabled ARIFwd-
			 AtomicOpsCtl: ReqEn- EgressBlck-
		LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [b0] MSI-X: Enable- Count=1 Masked-
		Vector table: BAR=0 offset=00000000
		PBA: BAR=0 offset=00000000
	Capabilities: [100 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
		AERCap:	First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
			MultHdrRecCap+ MultHdrRecEn- TLPPfxPres- HdrLogCap-
		HeaderLog: 00000000 00000000 00000000 00000000
		RootCmd: CERptEn+ NFERptEn+ FERptEn+
		RootSta: CERcvd- MultCERcvd- UERcvd- MultUERcvd-
			 FirstFatal- NonFatalMsg- FatalMsg- IntMsg 0
		ErrorSrc: ERR_COR: 0000 ERR_FATAL/NONFATAL: 0000
	Capabilities: [148 v1] Secondary PCI Express
		LnkCtl3: LnkEquIntrruptEn-, PerformEqu-
		LaneErrStat: 0
	Capabilities: [158 v1] Physical Layer 16.0 GT/s <?>
	Capabilities: [17c v1] Lane Margining at the Receiver <?>
	Capabilities: [190 v1] L1 PM Substates
		L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2- ASPM_L1.1- L1_PM_Substates+
			  PortCommonModeRestoreTime=60us PortTPowerOnTime=40us
		L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
			   T_CommonMode=10us
		L1SubCtl2: T_PwrOn=10us
	Capabilities: [1a0 v1] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
	Capabilities: [2a0 v1] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
	Capabilities: [2d8 v1] Data Link Feature <?>
	Capabilities: [2e4 v1] Precision Time Measurement
		PTMCap: Requester:- Responder:+ Root:+
		PTMClockGranularity: 16ns
		PTMControl: Enabled:- RootSelected:-
		PTMEffectiveGranularity: Unknown
	Capabilities: [2f0 v1] Vendor Specific Information: ID=0004 Rev=1 Len=054 <?>
	Capabilities: [358 v1] Vendor Specific Information: ID=0006 Rev=0 Len=018 <?>
	Kernel driver in use: pcieport

0001:01:00.0 Network controller: Realtek Semiconductor Co., Ltd. RTL8822CE 802.11ac PCIe Wireless Network Adapter
	Subsystem: AzureWave RTL8822CE 802.11ac PCIe Wireless Network Adapter
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 32 bytes
	Interrupt: pin A routed to IRQ 313
	Region 0: I/O ports at 1000 [size=256]
	Region 2: Memory at 20a8000000 (64-bit, non-prefetchable) [size=64K]
	Capabilities: [40] Power Management version 3
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
		Address: 000000000f410040  Data: 0262
	Capabilities: [70] Express (v2) Endpoint, MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <4us, L1 <64us
			ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
		DevCtl:	CorrErr- NonFatalErr- FatalErr- UnsupReq-
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
		LnkCap:	Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <4us, L1 <64us
			ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk-
			ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s (ok), Width x1 (ok)
			TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Not Supported, TimeoutDis+, NROPrPrP-, LTR+
			 10BitTagComp-, 10BitTagReq-, OBFF Via message/WAKE#, ExtFmt-, EETLPPrefix-
			 EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
			 FRS-, TPHComp-, ExtTPHComp-
			 AtomicOpsCap: 32bit- 64bit- 128bitCAS-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis+, LTR+, OBFF Disabled
			 AtomicOpsCtl: ReqEn-
		LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [100 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
		AERCap:	First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
			MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
		HeaderLog: 00000000 00000000 00000000 00000000
	Capabilities: [148 v1] Device Serial Number 00-e0-4c-ff-fe-c8-22-01
	Capabilities: [158 v1] Latency Tolerance Reporting
		Max snoop latency: 0ns
		Max no snoop latency: 0ns
	Capabilities: [160 v1] L1 PM Substates
		L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
			  PortCommonModeRestoreTime=30us PortTPowerOnTime=60us
		L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
			   T_CommonMode=0us LTR1.2_Threshold=0ns
		L1SubCtl2: T_PwrOn=10us
	Kernel driver in use: rtl88x2ce
	Kernel modules: rtl8822ce

0005:00:00.0 PCI bridge: NVIDIA Corporation Device 229a (rev a1) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 55
	Bus: primary=00, secondary=01, subordinate=ff, sec-latency=0
	I/O behind bridge: 0000f000-00000fff [disabled]
	Memory behind bridge: 40000000-405fffff [size=6M]
	Prefetchable memory behind bridge: 0000002740000000-00000027400fffff [size=1M]
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
	BridgeCtl: Parity- SERR+ NoISA- VGA- VGA16- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [40] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold-)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
		Address: 0000000000000000  Data: 0000
		Masking: 00000000  Pending: 00000000
	Capabilities: [70] Express (v2) Root Port (Slot-), MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0
			ExtTag- RBE+
		DevCtl:	CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 256 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
		LnkCap:	Port #0, Speed 16GT/s, Width x8, ASPM not supported
			ClockPM- Surprise+ LLActRep+ BwNot+ ASPMOptComp+
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk-
			ExtSynch- ClockPM- AutWidDis- BWInt+ AutBWInt-
		LnkSta:	Speed 8GT/s (downgraded), Width x8 (ok)
			TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-
		RootCap: CRSVisible+
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna+ CRSVisible+
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
		DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, NROPrPrP+, LTR+
			 10BitTagComp+, 10BitTagReq-, OBFF Not Supported, ExtFmt-, EETLPPrefix-
			 EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
			 FRS-, LN System CLS Not Supported, TPHComp-, ExtTPHComp-, ARIFwd+
			 AtomicOpsCap: Routing- 32bit- 64bit- 128bitCAS-
		DevCtl2: Completion Timeout: 1ms to 10ms, TimeoutDis-, LTR+, OBFF Disabled ARIFwd-
			 AtomicOpsCtl: ReqEn- EgressBlck-
		LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+
			 EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
	Capabilities: [b0] MSI-X: Enable- Count=8 Masked-
		Vector table: BAR=2 offset=00000000
		PBA: BAR=2 offset=00010000
	Capabilities: [100 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
		AERCap:	First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
			MultHdrRecCap+ MultHdrRecEn- TLPPfxPres- HdrLogCap-
		HeaderLog: 00000000 00000000 00000000 00000000
		RootCmd: CERptEn+ NFERptEn+ FERptEn+
		RootSta: CERcvd- MultCERcvd- UERcvd- MultUERcvd-
			 FirstFatal- NonFatalMsg- FatalMsg- IntMsg 0
		ErrorSrc: ERR_COR: 0000 ERR_FATAL/NONFATAL: 0000
	Capabilities: [148 v1] Secondary PCI Express
		LnkCtl3: LnkEquIntrruptEn-, PerformEqu-
		LaneErrStat: 0
	Capabilities: [168 v1] Physical Layer 16.0 GT/s <?>
	Capabilities: [190 v1] Lane Margining at the Receiver <?>
	Capabilities: [1c0 v1] L1 PM Substates
		L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2- ASPM_L1.1- L1_PM_Substates+
			  PortCommonModeRestoreTime=60us PortTPowerOnTime=40us
		L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
			   T_CommonMode=60us
		L1SubCtl2: T_PwrOn=10us
	Capabilities: [1d0 v1] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
	Capabilities: [2d0 v1] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
	Capabilities: [308 v1] Data Link Feature <?>
	Capabilities: [314 v1] Precision Time Measurement
		PTMCap: Requester:+ Responder:+ Root:+
		PTMClockGranularity: 16ns
		PTMControl: Enabled:- RootSelected:-
		PTMEffectiveGranularity: Unknown
	Capabilities: [320 v1] Vendor Specific Information: ID=0004 Rev=1 Len=054 <?>
	Capabilities: [388 v1] Vendor Specific Information: ID=0006 Rev=0 Len=018 <?>
	Kernel driver in use: pcieport

0005:01:00.0 Network controller: NVIDIA Corporation Device 2296
	Subsystem: NVIDIA Corporation Device 0000
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 55
	Region 0: Memory at 2b28000000 (32-bit, non-prefetchable) [size=4M]
	Region 2: Memory at 2740000000 (64-bit, prefetchable) [size=128K]
	Region 4: Memory at 2b28400000 (64-bit, non-prefetchable) [size=1M]
	Capabilities: [40] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [50] MSI: Enable- Count=1/16 Maskable+ 64bit+
		Address: 0000000000000000  Data: 0000
		Masking: 00000000  Pending: 00000000
	Capabilities: [70] Express (v2) Endpoint, MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
			ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
		DevCtl:	CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 256 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
		LnkCap:	Port #0, Speed 16GT/s, Width x8, ASPM not supported
			ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk-
			ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 8GT/s (downgraded), Width x8 (ok)
			TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, NROPrPrP-, LTR+
			 10BitTagComp+, 10BitTagReq-, OBFF Not Supported, ExtFmt-, EETLPPrefix-
			 EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
			 FRS-, TPHComp-, ExtTPHComp-
			 AtomicOpsCap: 32bit- 64bit- 128bitCAS-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled
			 AtomicOpsCtl: ReqEn-
		LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+
			 EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
	Capabilities: [b0] MSI-X: Enable+ Count=8 Masked-
		Vector table: BAR=2 offset=00000000
		PBA: BAR=2 offset=00010000
	Capabilities: [100 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
		AERCap:	First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
			MultHdrRecCap+ MultHdrRecEn- TLPPfxPres- HdrLogCap-
		HeaderLog: 00000000 00000000 00000000 00000000
	Capabilities: [148 v1] Secondary PCI Express
		LnkCtl3: LnkEquIntrruptEn-, PerformEqu-
		LaneErrStat: 0
	Capabilities: [168 v1] Physical Layer 16.0 GT/s <?>
	Capabilities: [190 v1] Lane Margining at the Receiver <?>
	Capabilities: [1b8 v1] Latency Tolerance Reporting
		Max snoop latency: 0ns
		Max no snoop latency: 0ns
	Capabilities: [1c0 v1] L1 PM Substates
		L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2- ASPM_L1.1- L1_PM_Substates+
			  PortCommonModeRestoreTime=60us PortTPowerOnTime=40us
		L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
			   T_CommonMode=0us
		L1SubCtl2: T_PwrOn=10us
	Capabilities: [1d0 v1] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
	Capabilities: [2d0 v1] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
	Capabilities: [308 v1] Data Link Feature <?>
	Capabilities: [314 v1] Precision Time Measurement
		PTMCap: Requester:+ Responder:- Root:-
		PTMClockGranularity: Unimplemented
		PTMControl: Enabled:- RootSelected:-
		PTMEffectiveGranularity: Unknown
	Capabilities: [320 v1] Vendor Specific Information: ID=0003 Rev=1 Len=054 <?>
	Kernel driver in use: tvnet

Hi @WayneWWW

I found the root cause of this issue: the PCIe vnet driver does not assign max_mtu when it probes/binds. Please refer to the patches below.
However, throughput is still poor, especially from the RC side to the EP side (about 4.1 to 4.4 Gbits/sec, versus about 6.7 Gbits/sec from EP to RC in the iperf3 logs below). Is this a limitation of NVIDIA's PCIe vnet driver?
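
To put those numbers in context, here is a rough back-of-the-envelope sketch comparing the iperf3 averages with the raw link ceiling. It assumes the link parameters reported by lspci in LnkSta (Speed 8GT/s, Width x8) and the 128b/130b line coding used at 8 GT/s. It ignores TLP/DLLP overhead and everything in the network stack, so it is only an upper bound, not what the vnet driver could realistically reach.

```python
# Rough PCIe link ceiling vs. the iperf3 averages quoted in this post.
# Link parameters are taken from the lspci LnkSta line: Speed 8GT/s, Width x8.

GT_PER_SEC = 8.0       # per-lane transfer rate from LnkSta (GT/s)
LANES = 8              # link width from LnkSta
ENCODING = 128 / 130   # 128b/130b line coding used at 8 GT/s


def link_ceiling_gbps(gt_per_sec, lanes, encoding):
    """Raw link bandwidth in Gbit/s after line coding, before protocol overhead."""
    return gt_per_sec * lanes * encoding


ceiling = link_ceiling_gbps(GT_PER_SEC, LANES, ENCODING)

# iperf3 receiver averages from the logs in this post (Gbit/s).
measured = {
    "RC -> EP": [4.43, 4.09],
    "EP -> RC": [6.66, 6.69],
}

for direction, runs in measured.items():
    avg = sum(runs) / len(runs)
    share = 100 * avg / ceiling
    print(f"{direction}: {avg:.2f} Gbit/s, {share:.1f}% of {ceiling:.1f} Gbit/s")
```

Both directions come out at a small fraction of the roughly 63 Gbit/s raw ceiling, and the link itself is reported as downgraded from 16GT/s to 8GT/s.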

Performance Test EP side:

root@tegra-ubuntu:/sys/kernel/config/pci_ep# iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 10.10.10.1, port 38200
[  5] local 10.10.10.2 port 5201 connected to 10.10.10.1 port 38202
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   478 MBytes  4.01 Gbits/sec                  
[  5]   1.00-2.00   sec   491 MBytes  4.12 Gbits/sec                  
[  5]   2.00-3.00   sec   502 MBytes  4.21 Gbits/sec                  
[  5]   3.00-4.00   sec   524 MBytes  4.39 Gbits/sec                  
[  5]   4.00-5.00   sec   542 MBytes  4.54 Gbits/sec                  
[  5]   5.00-6.00   sec   547 MBytes  4.59 Gbits/sec                  
[  5]   6.00-7.00   sec   550 MBytes  4.61 Gbits/sec                  
[  5]   7.00-8.00   sec   548 MBytes  4.60 Gbits/sec                  
[  5]   8.00-9.00   sec   547 MBytes  4.59 Gbits/sec                  
[  5]   9.00-10.00  sec   547 MBytes  4.59 Gbits/sec                  
[  5]  10.00-10.00  sec  1.00 MBytes  4.20 Gbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec  5.15 GBytes  4.43 Gbits/sec                  receiver
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 10.10.10.1, port 38210
[  5] local 10.10.10.2 port 5201 connected to 10.10.10.1 port 38212
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   272 MBytes  2.28 Gbits/sec                  
[  5]   1.00-2.00   sec   274 MBytes  2.30 Gbits/sec                  
[  5]   2.00-3.00   sec   518 MBytes  4.35 Gbits/sec                  
[  5]   3.00-4.00   sec   543 MBytes  4.56 Gbits/sec                  
[  5]   4.00-5.00   sec   544 MBytes  4.57 Gbits/sec                  
[  5]   5.00-6.00   sec   544 MBytes  4.56 Gbits/sec                  
[  5]   6.00-7.00   sec   543 MBytes  4.56 Gbits/sec                  
[  5]   7.00-8.00   sec   543 MBytes  4.55 Gbits/sec                  
[  5]   8.00-9.00   sec   544 MBytes  4.56 Gbits/sec                  
[  5]   9.00-10.00  sec   545 MBytes  4.57 Gbits/sec                  
[  5]  10.00-10.00  sec   896 KBytes  4.33 Gbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec  4.76 GBytes  4.09 Gbits/sec                  receiver
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
^Ciperf3: interrupt - the server has terminated
root@tegra-ubuntu:/sys/kernel/config/pci_ep# iperf3 -c 10.10.10.1
Connecting to host 10.10.10.1, port 5201
[  5] local 10.10.10.2 port 44478 connected to 10.10.10.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   761 MBytes  6.38 Gbits/sec    0   3.07 MBytes       
[  5]   1.00-2.00   sec   785 MBytes  6.58 Gbits/sec    0   3.07 MBytes       
[  5]   2.00-3.00   sec   792 MBytes  6.64 Gbits/sec    0   3.07 MBytes       
[  5]   3.00-4.00   sec   794 MBytes  6.66 Gbits/sec    0   3.07 MBytes       
[  5]   4.00-5.00   sec   794 MBytes  6.66 Gbits/sec    0   3.07 MBytes       
[  5]   5.00-6.00   sec   798 MBytes  6.69 Gbits/sec    0   3.07 MBytes       
[  5]   6.00-7.00   sec   802 MBytes  6.73 Gbits/sec    0   3.07 MBytes       
[  5]   7.00-8.00   sec   804 MBytes  6.74 Gbits/sec    0   3.07 MBytes       
[  5]   8.00-9.00   sec   804 MBytes  6.74 Gbits/sec    0   3.07 MBytes       
[  5]   9.00-10.00  sec   805 MBytes  6.75 Gbits/sec    0   3.07 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  7.75 GBytes  6.66 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  7.75 GBytes  6.66 Gbits/sec                  receiver

iperf Done.
root@tegra-ubuntu:/sys/kernel/config/pci_ep# iperf3 -c 10.10.10.1
Connecting to host 10.10.10.1, port 5201
[  5] local 10.10.10.2 port 44512 connected to 10.10.10.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   783 MBytes  6.57 Gbits/sec    0   3.14 MBytes       
[  5]   1.00-2.00   sec   801 MBytes  6.72 Gbits/sec    0   3.14 MBytes       
[  5]   2.00-3.00   sec   804 MBytes  6.74 Gbits/sec    0   3.14 MBytes       
[  5]   3.00-4.00   sec   796 MBytes  6.67 Gbits/sec    0   3.14 MBytes       
[  5]   4.00-5.00   sec   798 MBytes  6.70 Gbits/sec    0   3.14 MBytes       
[  5]   5.00-6.00   sec   801 MBytes  6.71 Gbits/sec    0   3.14 MBytes       
[  5]   6.00-7.00   sec   799 MBytes  6.71 Gbits/sec    0   3.14 MBytes       
[  5]   7.00-8.00   sec   800 MBytes  6.71 Gbits/sec    0   3.14 MBytes       
[  5]   8.00-9.00   sec   799 MBytes  6.70 Gbits/sec    0   3.14 MBytes       
[  5]   9.00-10.00  sec   799 MBytes  6.70 Gbits/sec    0   3.14 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  7.79 GBytes  6.69 Gbits/sec    0             sender
[  5]   0.00-10.01  sec  7.79 GBytes  6.69 Gbits/sec                  receiver

iperf Done.

Performance Test RC side:

root@tegra-ubuntu:/home/denso# iperf3 -c 10.10.10.2
Connecting to host 10.10.10.2, port 5201
[  5] local 10.10.10.1 port 38202 connected to 10.10.10.2 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   479 MBytes  4.01 Gbits/sec    0   1.23 MBytes       
[  5]   1.00-2.00   sec   490 MBytes  4.12 Gbits/sec    0   1.29 MBytes       
[  5]   2.00-3.00   sec   502 MBytes  4.21 Gbits/sec    0   1.29 MBytes       
[  5]   3.00-4.00   sec   524 MBytes  4.39 Gbits/sec    0   2.03 MBytes       
[  5]   4.00-5.00   sec   541 MBytes  4.54 Gbits/sec    0   2.03 MBytes       
[  5]   5.00-6.00   sec   548 MBytes  4.59 Gbits/sec    0   2.03 MBytes       
[  5]   6.00-7.00   sec   550 MBytes  4.61 Gbits/sec    0   2.03 MBytes       
[  5]   7.00-8.00   sec   548 MBytes  4.60 Gbits/sec    0   2.03 MBytes       
[  5]   8.00-9.00   sec   548 MBytes  4.59 Gbits/sec    0   2.03 MBytes       
[  5]   9.00-10.00  sec   548 MBytes  4.59 Gbits/sec    0   2.03 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  5.15 GBytes  4.43 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  5.15 GBytes  4.43 Gbits/sec                  receiver

iperf Done.
root@tegra-ubuntu:/home/denso# iperf3 -c 10.10.10.2
Connecting to host 10.10.10.2, port 5201
[  5] local 10.10.10.1 port 38212 connected to 10.10.10.2 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   272 MBytes  2.28 Gbits/sec    0   1.35 MBytes       
[  5]   1.00-2.00   sec   275 MBytes  2.30 Gbits/sec    0   1.35 MBytes       
[  5]   2.00-3.00   sec   519 MBytes  4.36 Gbits/sec    0   1.35 MBytes       
[  5]   3.00-4.00   sec   542 MBytes  4.56 Gbits/sec    0   1.35 MBytes       
[  5]   4.00-5.00   sec   545 MBytes  4.57 Gbits/sec    0   1.35 MBytes       
[  5]   5.00-6.00   sec   542 MBytes  4.56 Gbits/sec    0   1.35 MBytes       
[  5]   6.00-7.00   sec   544 MBytes  4.56 Gbits/sec    0   1.35 MBytes       
[  5]   7.00-8.00   sec   542 MBytes  4.55 Gbits/sec    0   1.35 MBytes       
[  5]   8.00-9.00   sec   544 MBytes  4.56 Gbits/sec    0   1.35 MBytes       
[  5]   9.00-10.00  sec   546 MBytes  4.57 Gbits/sec    0   1.48 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  4.76 GBytes  4.09 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  4.76 GBytes  4.09 Gbits/sec                  receiver

iperf Done.
root@tegra-ubuntu:/home/denso# iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 10.10.10.2, port 44476
[  5] local 10.10.10.1 port 5201 connected to 10.10.10.2 port 44478
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   756 MBytes  6.34 Gbits/sec                  
[  5]   1.00-2.00   sec   785 MBytes  6.59 Gbits/sec                  
[  5]   2.00-3.00   sec   792 MBytes  6.64 Gbits/sec                  
[  5]   3.00-4.00   sec   794 MBytes  6.66 Gbits/sec                  
[  5]   4.00-5.00   sec   793 MBytes  6.66 Gbits/sec                  
[  5]   5.00-6.00   sec   798 MBytes  6.69 Gbits/sec                  
[  5]   6.00-7.00   sec   803 MBytes  6.74 Gbits/sec                  
[  5]   7.00-8.00   sec   804 MBytes  6.74 Gbits/sec                  
[  5]   8.00-9.00   sec   804 MBytes  6.74 Gbits/sec                  
[  5]   9.00-10.00  sec   805 MBytes  6.76 Gbits/sec                  
[  5]  10.00-10.00  sec  3.90 MBytes  8.10 Gbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec  7.75 GBytes  6.66 Gbits/sec                  receiver
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 10.10.10.2, port 44510
[  5] local 10.10.10.1 port 5201 connected to 10.10.10.2 port 44512
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   779 MBytes  6.53 Gbits/sec                  
[  5]   1.00-2.00   sec   801 MBytes  6.72 Gbits/sec                  
[  5]   2.00-3.00   sec   804 MBytes  6.74 Gbits/sec                  
[  5]   3.00-4.00   sec   796 MBytes  6.67 Gbits/sec                  
[  5]   4.00-5.00   sec   799 MBytes  6.70 Gbits/sec                  
[  5]   5.00-6.00   sec   800 MBytes  6.71 Gbits/sec                  
[  5]   6.00-7.00   sec   799 MBytes  6.71 Gbits/sec                  
[  5]   7.00-8.00   sec   799 MBytes  6.71 Gbits/sec                  
[  5]   8.00-9.00   sec   799 MBytes  6.71 Gbits/sec                  
[  5]   9.00-10.00  sec   798 MBytes  6.70 Gbits/sec                  
[  5]  10.00-10.01  sec  4.53 MBytes  7.46 Gbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.01  sec  7.79 GBytes  6.69 Gbits/sec                  receiver
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
^Ciperf3: interrupt - the server has terminated

PCIe vnet driver patches:

diff --git a/drivers/net/ethernet/nvidia/pcie/tegra_vnet.c b/drivers/net/ethernet/nvidia/pcie/tegra_vnet.c
index d4512ce9a..eb5d7c861 100644
--- a/drivers/net/ethernet/nvidia/pcie/tegra_vnet.c
+++ b/drivers/net/ethernet/nvidia/pcie/tegra_vnet.c
@@ -803,6 +803,7 @@ static int tvnet_host_probe(struct pci_dev *pdev,
        netif_napi_add(ndev, &tvnet->napi, tvnet_host_poll, TVNET_NAPI_WEIGHT);
 
        ndev->mtu = TVNET_DEFAULT_MTU;
+       ndev->max_mtu = TVNET_MAX_MTU;
 
        ret = register_netdev(ndev);
        if (ret) {
diff --git a/drivers/pci/endpoint/functions/pci-epf-tegra-vnet.c b/drivers/pci/endpoint/functions/pci-epf-tegra-vnet.c
index 834081439..5c5890540 100644
--- a/drivers/pci/endpoint/functions/pci-epf-tegra-vnet.c
+++ b/drivers/pci/endpoint/functions/pci-epf-tegra-vnet.c
@@ -1518,6 +1518,7 @@ static int tvnet_ep_pci_epf_bind(struct pci_epf *epf)
        netif_napi_add(ndev, &tvnet->napi, tvnet_ep_poll, TVNET_NAPI_WEIGHT);
 
        ndev->mtu = TVNET_DEFAULT_MTU;
+       ndev->max_mtu = TVNET_MAX_MTU;
 
        ret = register_netdev(ndev);
        if (ret < 0) {
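
Some background on why max_mtu may matter here: since Linux 4.10 the networking core rejects an MTU outside the device's [min_mtu, max_mtu] range, and ether_setup() initializes max_mtu to ETH_DATA_LEN (1500). That would be consistent with kernel 4.9 working and the newer kernel not, but that is only a guess. The sketch below is a userspace model of that range check, not the kernel code. It assumes the vnet netdev is set up as an Ethernet-style device, and the 64512 value is a hypothetical placeholder because the real TVNET_DEFAULT_MTU / TVNET_MAX_MTU definitions are not shown in this thread.

```python
import errno

ETH_MIN_MTU = 68      # kernel's minimum Ethernet MTU
ETH_DATA_LEN = 1500   # default max_mtu set by ether_setup()

# Hypothetical placeholder: the real TVNET_MAX_MTU value is not shown in this thread.
HYPOTHETICAL_TVNET_MAX_MTU = 64512


def validate_mtu(new_mtu, min_mtu, max_mtu):
    """Model of the core range check: 0 on success, -EINVAL when out of range."""
    if new_mtu < 0 or new_mtu < min_mtu:
        return -errno.EINVAL
    # max_mtu == 0 means the core enforces no upper limit.
    if max_mtu > 0 and new_mtu > max_mtu:
        return -errno.EINVAL
    return 0


# Without the patch (assumed Ethernet defaults): max_mtu stays at 1500.
print(validate_mtu(HYPOTHETICAL_TVNET_MAX_MTU, ETH_MIN_MTU, ETH_DATA_LEN))

# With the patch: max_mtu is raised, so the large MTU falls inside the range.
print(validate_mtu(HYPOTHETICAL_TVNET_MAX_MTU, ETH_MIN_MTU, HYPOTHETICAL_TVNET_MAX_MTU))
```

On a running system, `ip -d link show` on the vnet interface prints the minmtu and maxmtu values the kernel is actually using, which is a quick way to confirm whether the patch took effect.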

Are you sure this issue is related to the max_mtu setting? I really doubt it…