Jetson AGX Orin Endpoint Mode - nothing is working, HELP!

Hi all,
Currently I am trying to set up an AGX Orin as a PCIe endpoint to send data to a more powerful desktop machine.
Specifically (if it matters), I am interested in sending raw camera data over the PCIe cable, as this theoretically allows for much higher throughput than USB or Ethernet. Our cameras connect to the CSI-2 interface on the Orin, which is nice, but unfortunately CSI-2 is not well supported on more traditional machines. Our thinking is that we can use the Orin as a bridge between such a machine and the CSI-2 cameras.
However, I am having trouble getting PCIe endpoint mode to work. Here are the steps I have taken so far:

First, I downloaded the files from this location: Jetson Linux Release 36.4.4 | NVIDIA Developer
Specifically, I downloaded the Driver Package and the Sample Root Filesystem.
I extracted the Driver Package, which produced the “Linux_for_Tegra” directory, complete with scripts and files.
I extracted the Sample Root Filesystem package into the “rootfs” directory.
I ran the apply_binaries.sh script.
I put my Orin in forced recovery mode and ran the flash.sh script as shown in the PCIe endpoint guide.
I confirmed that the PCIe endpoint modules are present on my AGX Orin via lsmod:

$ lsmod | grep pci
pci_epf_tegra_vnet 32768 1
tegra_pcie_dma_test 20480 0
tegra_pcie_edma 20480 1 tegra_pcie_dma_test
pcie_tegra194 40960 0

I connected the AGX endpoint to my root port machine using an x16 PCIe cable with tx<->rx signal swap. This cable was suggested by a PDF I found describing the root/endpoint layout between two Jetson Xavier systems
(Jetson_AGX_Xavier_PCIe_Endpoint_Design_Guidelines.pdf).

I configured and enabled PCIe endpoint mode for the pci_epf_tvnet function as shown in the documentation (mkdir functions/pci_epf_tvnet/func1 etc…).
On the host side I compiled the tegra_vnet kernel module, and I load it with modprobe after each reboot.
When I reboot my host system, I see the following output from tegra194-pcie on the endpoint:

[ 2503.615795] tegra194-pcie 141a0000.pcie-ep: iATU unroll: enabled
[ 2503.615808] tegra194-pcie 141a0000.pcie-ep: Detected iATU regions: 8 outbound, 2 inbound
[ 2504.604721] tegra194-pcie 141a0000.pcie-ep: iATU unroll: enabled
[ 2504.604728] tegra194-pcie 141a0000.pcie-ep: Detected iATU regions: 8 outbound, 2 inbound

However, I do not see any activity from my endpoint function modules.
The only activity I do see is when I start the pci_epf_tvnet function while NetworkManager is running, in which case I see the following dmesg output (repeated a number of times):

[ 4822.261111] pci_epf_tvnet pci_epf_tvnet.0: tvnet_ep_open: PCIe link is not up

Disabling NetworkManager before starting the module eliminates these messages (but also disables internet access, so… a bit awkward).
I suspected that endpoint mode might work better between two AGX Orin machines, with one as root port and the other as endpoint, as this is closer to the setup used in the documentation (though the documentation implies that any PC should work as root port).
To check, I flashed another AGX Orin, this time without the ODMDATA line in jetson-agx-orin-devkit.conf, and connected it to my endpoint Orin with the same PCIe cable.
Unfortunately, no dice. The tegra194-pcie log messages are slightly different this time:

[ 235.268834] tegra194-pcie 141a0000.pcie-ep: iATU unroll: enabled
[ 235.268847] tegra194-pcie 141a0000.pcie-ep: Detected iATU regions: 8 outbound, 2 inbound
[ 249.119540] tegra194-pcie 141a0000.pcie-ep: LTSSM state: 0x2018 timeout: -110
[ 262.887676] tegra194-pcie 141a0000.pcie-ep: iATU unroll: enabled
[ 262.887689] tegra194-pcie 141a0000.pcie-ep: Detected iATU regions: 8 outbound, 2 inbound
[ 265.130463] tegra194-pcie 141a0000.pcie-ep: LTSSM state: 0x2018 timeout: -110

But there is nothing from pci_epf_tegra_vnet or tegra_vnet, and the device does not show up in lspci.
I added tegra_vnet to /etc/modules-load.d/modules.conf on the root machine, and that did produce a message:

[ 9.175955] tegra_vnet: loading out-of-tree module taints kernel

So… I am not sure what that means.

I’m quite a bit out of my depth here, and would appreciate any information/support/ideas people have to offer.

Cheers!

How did you flash your AGX Orin? You didn’t mention enabling EP mode on the Orin.

I did attempt to enable it: I inserted the appropriate ODMDATA variable in the .conf file. In case I somehow made a mistake there (even though it was copy and paste), is there some way to verify whether EP mode is enabled on the Jetson?
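
One possible check from the running system, sketched here as an assumption rather than an official procedure: the Linux PCI endpoint framework lists controllers that are operating in endpoint mode under the configfs directory /sys/kernel/config/pci_ep/controllers, and the endpoint-side dmesg lines in this thread name the device 141a0000.pcie-ep. The helper names list_ep_controllers and ep_mode_enabled are made up for illustration.

```shell
#!/bin/sh
# Sketch: list PCIe controllers registered with the endpoint framework.
# The optional first argument overrides the configfs root (useful for a dry run).

list_ep_controllers() {
    root="${1:-/sys/kernel/config/pci_ep}"
    for ctrl in "$root"/controllers/*.pcie-ep; do
        # If the glob matched nothing it is left unexpanded; skip that case.
        [ -e "$ctrl" ] || continue
        basename "$ctrl"
    done
}

# Succeeds only if at least one endpoint-mode controller is listed.
ep_mode_enabled() {
    [ -n "$(list_ep_controllers "$1")" ]
}
```

On the endpoint Orin, running `ep_mode_enabled && echo "EP controller present"` would be expected to print the message only if a controller such as 141a0000.pcie-ep is registered. An empty result can also mean that configfs is not mounted, so treat it as a hint rather than proof.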

Do you have two Orins there, so that you can test with one as RP and one as EP?

Yes. After I failed to get it to work with my desktop, I tried with another Orin. Still no dice, but I got slightly different dmesg output, which I described in the original post.

Please share all of the steps (commands) you used to test the two-Orin scenario.

List them separately for the RP side and the EP side.

Okay, I will attempt this and update you tomorrow, thanks.

Hi, so I just tried again. One thing I realized is that I was looking at the wrong documentation: I am on r36.4.4, and the documentation I was looking at was for r35.1, which is the one that shows up first on Google. I am still having problems, though. Here is what I have done:

On EP Orin:

I performed the following steps from the r36.4.4 documentation:

# modprobe pci-epf-dma-test
# cd /sys/kernel/config/pci_ep/
# mkdir functions/tegra_pcie_dma_epf/func1
# echo 0x10de > functions/tegra_pcie_dma_epf/func1/vendorid
# echo 0x229a > functions/tegra_pcie_dma_epf/func1/deviceid
# echo 16 > functions/tegra_pcie_dma_epf/func1/msi_interrupts
# ln -s functions/tegra_pcie_dma_epf/func1 controllers/${PCIE_EP_ADDR}.pcie-ep/
# echo 1 > controllers/${PCIE_EP_ADDR}.pcie-ep/start
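
The commands above leave PCIE_EP_ADDR unset. Here is a minimal sketch of the same sequence as a shell function, assuming the controller address is 141a0000 (the address that appears in the dmesg output in this thread) and that the module has already been loaded with modprobe. The function name setup_dma_epf and its two parameters are hypothetical; the configfs root is a parameter only so that the sequence can be dry-run against a scratch directory.

```shell
#!/bin/sh
# Sketch: run the configfs steps for the tegra_pcie_dma_epf function,
# stopping at the first step that fails.
#   $1 = configfs root (on the Orin: /sys/kernel/config/pci_ep)
#   $2 = controller address (assumed here to be 141a0000)

setup_dma_epf() {
    root="$1"
    addr="$2"
    func="functions/tegra_pcie_dma_epf/func1"
    ctrl="controllers/${addr}.pcie-ep"
    (
        cd "$root" || exit 1
        # Without this directory the controller is not in endpoint mode.
        [ -d "$ctrl" ] || { echo "no endpoint controller at $root/$ctrl" >&2; exit 1; }
        mkdir "$func" || exit 1
        echo 0x10de > "$func/vendorid" || exit 1
        echo 0x229a > "$func/deviceid" || exit 1
        echo 16 > "$func/msi_interrupts" || exit 1
        # Bind the function to the controller, then start the controller.
        ln -s "$func" "$ctrl/" || exit 1
        echo 1 > "$ctrl/start" || exit 1
    )
}
```

On the endpoint this would be called as root with `setup_dma_epf /sys/kernel/config/pci_ep 141a0000`. Running the steps in a subshell keeps the caller's working directory unchanged, and the early directory check turns a missing endpoint controller into an explicit error instead of a failed ln.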

On the RP side:

I added tegra_pcie_dma_test to /etc/modules-load.d/modules.conf, then rebooted the machine. This produces the following dmesg output on the EP side:

[ 1693.615049] tegra194-pcie 141a0000.pcie-ep: iATU unroll: enabled
[ 1693.615063] tegra194-pcie 141a0000.pcie-ep: Detected iATU regions: 8 outbound, 2 inbound
[ 1693.615248] pcie_dma_epf tegra_pcie_dma_epf.0: BAR0 phy_addr: f0000000 size: 10000000
[ 1708.521868] tegra194-pcie 141a0000.pcie-ep: LTSSM state: 0x2018 timeout: -110
[ 1722.874316] tegra194-pcie 141a0000.pcie-ep: iATU unroll: enabled
[ 1722.874330] tegra194-pcie 141a0000.pcie-ep: Detected iATU regions: 8 outbound, 2 inbound
[ 1722.874424] pcie_dma_epf tegra_pcie_dma_epf.0: BAR0 phy_addr: f0000000 size: 10000000
[ 1725.126569] tegra194-pcie 141a0000.pcie-ep: LTSSM state: 0x2018 timeout: -110

And the following dmesg logs on the RP side:

[ 12.815505] tegra194-pcie 141a0000.pcie: Adding to iommu group 55
[ 12.924418] tegra194-pcie 141a0000.pcie: host bridge /bus@0/pcie@141a0000 ranges:
[ 12.924441] tegra194-pcie 141a0000.pcie: MEM 0x2800000000..0x2b27ffffff -> 0x2800000000
[ 12.924447] tegra194-pcie 141a0000.pcie: MEM 0x2b28000000..0x2b2fffffff -> 0x0040000000
[ 12.924451] tegra194-pcie 141a0000.pcie: IO 0x003a100000..0x003a1fffff -> 0x003a100000
[ 12.924822] tegra194-pcie 141a0000.pcie: iATU unroll: enabled
[ 12.924824] tegra194-pcie 141a0000.pcie: Detected iATU regions: 8 outbound, 2 inbound
[ 14.031110] tegra194-pcie 141a0000.pcie: Phy link never came up
[ 15.027886] tegra194-pcie 141a0000.pcie: Phy link never came up
[ 15.028140] tegra194-pcie 141a0000.pcie: PCI host bridge to bus 0005:00
[ 15.028149] pci_bus 0005:00: root bus resource [io 0x200000-0x2fffff] (bus address [0x3a100000-0x3a1fffff])
[ 15.028156] pci_bus 0005:00: root bus resource [mem 0x2b28000000-0x2b2fffffff] (bus address [0x40000000-0x47ffffff])
[ 15.028164] pci_bus 0005:00: root bus resource [bus 00-ff]
[ 15.028167] pci_bus 0005:00: root bus resource [mem 0x2800000000-0x2b27ffffff pref]
[ 15.028242] pci 0005:00:00.0: [10de:229a] type 01 class 0x060400
[ 15.028450] pci 0005:00:00.0: PME# supported from D0 D3hot
[ 15.045465] pci 0005:00:00.0: PCI bridge to [bus 01-ff]
[ 15.046182] pcieport 0005:00:00.0: Adding to iommu group 55
[ 15.046571] pcieport 0005:00:00.0: PME: Signaling with IRQ 204
[ 15.048336] pcieport 0005:00:00.0: AER: enabled with IRQ 204
[ 15.049984] pci_bus 0005:01: busn_res: [bus 01-ff] is released
[ 15.051728] pci 0005:00:00.0: Removing from iommu group 55
[ 15.051785] pci_bus 0005:00: busn_res: [bus 00-ff] is released

So… it seems the PCI device is being found, but something goes wrong during link-up.
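
To compare runs like the ones above without reading every line, the link-related messages can be counted. This is a minimal sketch that only matches the three message strings quoted in this thread; the function name summarize_link_log and the output labels are hypothetical. For reference, the -110 in the LTSSM lines is the Linux errno value for a timeout (ETIMEDOUT).

```shell
#!/bin/sh
# Sketch: count PCIe link-failure messages in dmesg text read from stdin.

summarize_link_log() {
    awk '
        /LTSSM state: .* timeout:/ { ltssm++ }   # endpoint side: link training timed out
        /Phy link never came up/   { phy++ }     # root port side: no link partner detected
        /PCIe link is not up/      { notup++ }   # tvnet interface opened without a link
        END {
            printf "ltssm_timeout=%d phy_link_never_up=%d link_not_up=%d\n", ltssm, phy, notup
        }
    '
}
```

Typical use would be `dmesg | summarize_link_log` on each side after rebooting the root port. Non-zero counts on both sides at once, as in the logs above, suggest that neither side ever saw the other at the physical layer.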

On a whim, I again plugged the endpoint into my desktop instead of the Jetson. As in my original post, this resulted in a different log on the endpoint side:

[ 4488.519197] tegra194-pcie 141a0000.pcie-ep: iATU unroll: enabled
[ 4488.519211] tegra194-pcie 141a0000.pcie-ep: Detected iATU regions: 8 outbound, 2 inbound
[ 4488.519305] pcie_dma_epf tegra_pcie_dma_epf.0: BAR0 phy_addr: f0000000 size: 10000000

Interestingly, the LTSSM timeout is gone, but unfortunately there is no trace of anything on the root port side (my desktop).
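
A quick way to look for the endpoint on the root port side, sketched under the assumption that it enumerates with the vendor and device IDs written to configfs earlier (10de:229a). The helper name find_pci_id is hypothetical; it filters the numeric output of `lspci -n`, whose third column is vendor:device in lowercase hex.

```shell
#!/bin/sh
# Sketch: print the bus address of every device with a given vendor:device ID.
#   $1    = ID in lowercase hex, e.g. 10de:229a
#   stdin = output of `lspci -n`

find_pci_id() {
    awk -v id="$1" '$3 == id { print $1 }'
}
```

On the desktop this would be run as `lspci -n | find_pci_id 10de:229a`; `lspci -d 10de:229a` applies the same filter directly. On an Orin root port the bridge itself also reports 10de:229a (see the `pci 0005:00:00.0: [10de:229a] type 01` line in the RP log), so a match at bus 00 there is not the endpoint.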

What do you think might be the issue?

EDIT: I was using a pretty long cable. I saw some advice online suggesting that long cables may result in the “Phy link never came up” message, so I tested with a shorter cable. Same result, unfortunately.

Hi everyone,

It turns out there was a mistake in ordering the cable: we got the straight-through extension cable, not the one with the tx<->rx signal swap. We are going to order the correct cable and see how that does. For now I consider this resolved; I may come back once we have the new cable if there are still issues.

Thanks @WayneWWW for your assistance!
