X86 CPU to Xavier AGX (in endpoint mode) with PCIe: How to enable Ethernet over PCIe driver

RokiaDiarr · May 13, 2020, 12:07pm

Hi,

I have successfully connected our two Xavier AGX dev kits with a PCIe x16 cable and tested the “Ethernet over PCIe” drivers by following the steps provided in the Jetson Linux Developer Guide 34.1 documentation.

I use the JetPack 4.3 release and applied the patch to get 5 Gbps bandwidth.

Now, I want to replace one Xavier with an x86 CPU. The x86 CPU is the root complex and the Xavier is already flashed in endpoint mode.

How can I make the “Ethernet over PCIe” drivers work on my x86 CPU, as they currently do when I connect two Xaviers?

Without any other configuration, when I boot my x86 CPU, “lspci” does not list the Xavier.

Please, I need help.
Thanks.

vidyas · May 14, 2020, 5:46am

Please boot the AGX (EP) first and run the endpoint setup commands on it so that it is ready.
Also compile its device driver (tegra_vnet.c) on the x86 host. With this, you should be able to replicate the setup on x86 as well.
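
For readers unfamiliar with building a single driver outside the kernel tree, here is a minimal sketch of the usual out-of-tree build. It assumes the kernel headers for the running kernel are installed and that the source directory already holds a Kbuild/Makefile with a line such as "obj-m += tegra_vnet.o". The helper names (run, build_and_load), the example path, and the module file name are hypothetical, and whether tegra_vnet.c builds unmodified on a given x86 kernel is not guaranteed.

```shell
#!/bin/sh
# Sketch: build one out-of-tree kernel module against a kernel build tree
# and load it. Assumes kernel headers are installed and that SRC_DIR holds a
# Kbuild/Makefile containing a line such as: obj-m += tegra_vnet.o

# Print the commands when DRY_RUN=1, otherwise execute them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# build_and_load SRC_DIR MODULE_NAME [KERNEL_RELEASE]
# SRC_DIR must be an absolute path, because kbuild changes directory.
build_and_load() {
    src=$1
    mod=$2
    krel=${3:-$(uname -r)}
    run make -C "/lib/modules/$krel/build" M="$src" modules
    run sudo insmod "$src/$mod.ko"
}

# Example (hypothetical path and module name):
#   build_and_load "$HOME/tvnet" tegra_vnet
```

Setting DRY_RUN=1 before calling build_and_load prints the two commands instead of running them, which is a cheap way to check the paths first.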

RokiaDiarr · May 18, 2020, 10:53am

Thanks for your response.
I compiled the “tegra_vnet.c” driver and got it working on my x86. The Xavier AGX is correctly identified by the x86. The “lspci” output is below:

b3:00.0 Network controller: NVIDIA Corporation Device 2296
Subsystem: NVIDIA Corporation Device 0000
Flags: bus master, fast devsel, latency 0, IRQ 96, NUMA node 0
Memory at fb800000 (32-bit, non-prefetchable) [size=4M]
Memory at fffff00000 (64-bit, prefetchable) [size=128K]
Memory at fbc00000 (64-bit, non-prefetchable) [size=1M]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable- Count=1/16 Maskable+ 64bit-
Capabilities: [70] Express Endpoint, MSI 00
Capabilities: [b0] MSI-X: Enable+ Count=8 Masked-
Capabilities: [100] Advanced Error Reporting
Capabilities: [148] #19
Capabilities: [168] #26
Capabilities: [190] #27
Capabilities: [1b8] Latency Tolerance Reporting
Capabilities: [1c0] L1 PM Substates
Capabilities: [1d0] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
Capabilities: [2d0] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
Capabilities: [308] #25
Capabilities: [314] Precision Time Measurement
Capabilities: [320] Vendor Specific Information: ID=0003 Rev=1 Len=054 <?>
Kernel driver in use: tvnet

However, after all the configuration, I ran iperf3 on both and got an “Unable to connect. No route to host” error message on the client (Xavier AGX) side. I can’t ping the x86 (192.168.2.2) from the AGX (192.168.2.1). I tried the bind option of iperf, but the connectivity issue remains.

I notice that on the x86, the Ethernet interface logical name is “eth0” instead of “eth1” (as I got on the Xavier).

Thanks

linuxdev · May 18, 2020, 8:02pm

On each end of the connection (on each host), do you now see something related to those addresses via “ifconfig” and “route” commands? I’ve never used the PCIe endpoint like this, but if ping or other network commands are used, then it makes sense that each end would also have to do normal network setup.

RokiaDiarr · May 19, 2020, 10:24am

Yes, I see those addresses via “ifconfig” on each host. Below is the output of “ifconfig” and “route” on both hosts.

On RootComplex (x86):

ifconfig

eno1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.0.14 netmask 255.255.255.0 broadcast 192.168.0.255
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 64512
inet 192.168.0.31 netmask 255.255.255.0 broadcast 192.168.0.255

route

Kernel IP routing table
Destination  Gateway      Genmask        Flags Metric Ref Use Iface
0.0.0.0      192.168.0.1  0.0.0.0        UG    100    0   0   eno1
169.254.0.0  0.0.0.0      255.255.0.0    U     1000   0   0   eno1
192.168.0.0  0.0.0.0      255.255.255.0  U     0      0   0   eth0
192.168.0.0  0.0.0.0      255.255.255.0  U     100    0   0   eno1

On EndPoint (AGX):

ifconfig

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.0.15 netmask 255.255.255.0 broadcast 192.168.0.255
eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 64512
inet 192.168.0.30 netmask 255.255.255.0 broadcast 192.168.0.255

route

Kernel IP routing table
Destination  Gateway      Genmask        Flags Metric Ref Use Iface
0.0.0.0      192.168.0.1  0.0.0.0        UG    100    0   0   eth0
169.254.0.0  0.0.0.0      255.255.0.0    U     1000   0   0   eth0
192.168.0.0  0.0.0.0      255.255.255.0  U     0      0   0   eth1
192.168.0.0  0.0.0.0      255.255.255.0  U     100    0   0   eth0

On each host, I can successfully ping its local interfaces (for example: 192.168.0.14 and 192.168.0.31 for the x86). I also have no problem accessing the internet on each host. However, I can’t ping the AGX from the x86 and vice versa. The firewall is disabled on the x86.

linuxdev · May 19, 2020, 6:56pm

On the x86 host you have a routing table which will fail:

Destination  Gateway     Genmask       Flags Metric Ref Use Iface
0.0.0.0      192.168.0.1 0.0.0.0       UG    100    0   0   eno1
169.254.0.0  0.0.0.0     255.255.0.0   U     1000   0   0   eno1
192.168.0.0  0.0.0.0     255.255.255.0 U     0      0   0   eth0
192.168.0.0  0.0.0.0     255.255.255.0 U     100    0   0   eno1

Notice that both eth0 and eno1 cover the entire subnet. The only difference is that eno1 will never get used unless eth0 fails. The “metric” describes priorities, and a higher metric implies a higher cost. eno1 will never see a packet (other than setting up DHCP and/or route) since it has a higher metric.
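
To spot this kind of conflict quickly, here is a small sketch that reads “route -n” style output and reports any destination/netmask pair that appears on more than one interface, together with the interface that has the lowest metric. The function name find_duplicate_routes is made up for illustration, and the column layout is assumed to match the tables posted above.

```shell
#!/bin/sh
# find_duplicate_routes: read `route -n` style lines on stdin and print every
# destination/netmask pair that is reachable through more than one interface,
# along with the interface that wins (lowest metric).
# Assumed columns: Destination Gateway Genmask Flags Metric Ref Use Iface
find_duplicate_routes() {
    awk '
        # Skip header lines: data rows start with a dotted-quad destination.
        $1 !~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ { next }
        {
            key = $1 "/" $3              # destination + genmask
            metric = $5 + 0
            count[key]++
            if (!(key in best) || metric < bestmetric[key]) {
                best[key] = $8           # interface name
                bestmetric[key] = metric
            }
        }
        END {
            for (key in count)
                if (count[key] > 1)
                    printf "%s: %d routes, preferred %s (metric %d)\n",
                           key, count[key], best[key], bestmetric[key]
        }
    '
}

# Typical use:
#   route -n | find_duplicate_routes
```

Fed the x86 table above, it flags 192.168.0.0/255.255.255.0 as having two routes with eth0 preferred, which matches the explanation.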

What I will suggest is that on the host you use something like “nm-connection-editor” (if you don’t have this, then “sudo apt-get install network-manager-gnome”) and set the PCIe link (eno1) to static IPv4 address 192.168.0.14, and netmask 255.255.255.254. Then manually set the metric to 0. You might find this reference URL helpful, especially information about metric:
https://docs.ubuntu.com/core/en/stacks/network/network-manager/docs/routing-tables

Another app which might help is “ifmetric” (“sudo apt-get install ifmetric”), although I tend to edit files (NetworkManager can make file editing seem to fail at times, so first do everything you can in “nm-connection-editor” before using other tools). The “route” command can use the “metric” argument to set a metric (see “man route”).

Example with the command line tool (part of network-manager):

nmcli connection modify eno1 ipv4.route-metric 0

(you might need to ifdown and ifup the interface, not sure)

Then run “route” again to see if the metric is 0. Having the eno1 cover only a single address, and making its metric equal to eth0, implies the more specific address will get the traffic (and it is more specific if the netmask is a “/31”, or “255.255.255.254”).

There is a similar issue with the Jetson. You have two interfaces with conflicting routes over the same address range. Only the route with the metric of “0” will get traffic. Setting the PCIe network to have a single address in the route (meaning netmask is “255.255.255.254” instead of “255.255.255.0”), and setting metric to 0, should make that route available for that address.

Normally what someone would do in a situation like this is to simply put the second NIC (which your PCIe is emulating) on a different subnet…then there would be no conflict in any way. For example, if the PCIe had address “192.168.1.x” (note “.1.” instead of “.0.”…a different “/24” subnet), then metric and route changes would never matter…this would be the only device on that subnet and there would be no confusion as to which interface to route to.
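
The separate-subnet approach can be sketched with iproute2 as follows. The interface names (eth0 on the x86, eth1 on the AGX) and the 192.168.2.x addresses are taken from earlier posts in this thread. The helper names run and configure_pcie_if are hypothetical. Note that NetworkManager may later override addresses set this way, so treat this as a quick test rather than a persistent configuration.

```shell
#!/bin/sh
# Sketch: move the PCIe virtual Ethernet interface onto its own /24 subnet
# so that it no longer overlaps with the wired LAN (192.168.0.0/24).

# Print the commands when DRY_RUN=1, otherwise execute them (needs root).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# configure_pcie_if IFACE ADDRESS/PREFIX
configure_pcie_if() {
    iface=$1
    addr=$2
    run ip addr flush dev "$iface"         # drop the old 192.168.0.x address
    run ip addr add "$addr" dev "$iface"   # assign the new address
    run ip link set dev "$iface" up
}

# On the x86 root complex:  configure_pcie_if eth0 192.168.2.2/24
# On the AGX endpoint:      configure_pcie_if eth1 192.168.2.1/24
```

After both sides are configured, “ip route get 192.168.2.1” on the x86 should name the PCIe interface as the outgoing device.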

Pretty much any online web search you perform for Ubuntu networking, or docs on subnets, would be the same for this. It won’t matter if the NIC is really PCIe or real or virtual USB…at this point it is about network setup, and not about the PCIe side.

omp · May 21, 2020, 10:39am

There is a known issue with the PCIe virtual network driver when working with an x86 host. Please wait for a further update on the fix.

Thanks,
Om

juns · May 21, 2020, 5:05pm

Thanks for your kind notification.
May I know the plan for the fix? Will it be in JetPack 4.4 GA or JetPack 4.5/5?

omp · May 29, 2020, 4:27pm

tvnet.zip (9.7 KB)

On the x86 host side, please use the attached EP driver.

Please note that the PCIe virtual Ethernet interface will have an MTU size of 64512. The interface name will be ethx.

Follow the steps below:

1. Boot the AGX (EP) first and start the function driver:
   cd /sys/kernel/config/pci_ep/
   mkdir functions/pci_epf_tvnet/func1
   ln -s functions/pci_epf_tvnet/func1 controllers/141a0000.pcie_ep/
   echo 1 > controllers/141a0000.pcie_ep/start
2. Boot the x86 host and verify that the PCIe virtual Ethernet EP has been detected:
   lspci
   …
   xx Network controller: NVIDIA Corporation Device 2296
3. Compile and load the attached driver on the x86 host.
4. Set the IP address on both the EP and the host.
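
The detection check in the steps above can also be scripted. This is a minimal sketch that looks for the endpoint by its numeric PCI ID in “lspci -n” output: 10de is the NVIDIA PCI vendor ID and 2296 is the device ID shown in the lspci output earlier in this thread. The function name has_tvnet_ep is hypothetical.

```shell
#!/bin/sh
# has_tvnet_ep: succeed if `lspci -n` style output on stdin lists the
# Xavier endpoint (vendor 10de = NVIDIA, device 2296 as reported earlier
# in this thread).
has_tvnet_ep() {
    grep -qi '10de:2296'
}

# Typical use on the x86 host:
#   if lspci -n | has_tvnet_ep; then
#       echo "EP detected"
#   else
#       echo "EP not found: boot the AGX and start the function driver first"
#   fi
```

If the endpoint is missing, the most likely cause based on this thread is that the x86 host was booted before the AGX had started its function driver.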

RokiaDiarr · May 31, 2020, 10:30am

Thanks

I tested the driver and it works.
