JetPack 5.0.2 Xavier PCIe endpoint mode

I used JetPack 5.0.2 to test Xavier’s PCIe endpoint mode. After following the steps in the documentation, the two devices cannot communicate over the network, and the Xavier on the endpoint side hangs and restarts.


So you have followed the instructions in "Flashing the PCIe Endpoint on a Jetson AGX Xavier Series System" and still got the same result?

Yes, I followed the instructions.
With JetPack 32.4.4 I followed the same steps without problems, and PCIe works fine.

Could you share all the steps you ran? Another user validated these steps just a week ago, so I don’t think a crash should happen.

1. First, I have two Xavier devices, one in root port mode and one in endpoint mode.
2. Then follow the steps below to flash two Xavier devices respectively:
    0x09190000 for root
    0x09190000 for endpoint
    [image]

3. After flashing is complete, following the steps, first boot the endpoint mode Xavier and run the following commands:

# cd /sys/kernel/config/pci_ep/
# mkdir functions/pci_epf_tvnet/func1
# echo 16 > functions/pci_epf_tvnet/func1/msi_interrupts
# ln -s functions/pci_epf_tvnet/func1 controllers/141a0000.pcie_ep/
# echo 1 > controllers/141a0000.pcie_ep/start

Then start root mode Xavier



        On the endpoint device: ifconfig eth1 up

        On the root port system: ifconfig eth1 up

        On the endpoint device: ifconfig eth1 192.168.2.1

        On the root port system: ifconfig eth1 192.168.2.2


Both Xavier devices show an eth1 interface, but ping between them does not work.

4. After about 10 minutes, the endpoint mode Xavier reports an error and reboots.
[image]
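
The addressing in step 3 can also be expressed with iproute2, the usual replacement for ifconfig. The sketch below only prints the commands for one side so they can be reviewed before running. The function name `tvnet_cmds` and the /24 prefix are assumptions (ifconfig without a netmask defaults to /24 for a 192.168.x.x address); `eth1` and the two addresses come from the steps above.

```shell
#!/bin/sh
# Print (not run) the commands that bring up the virtual PCIe network
# on one side. Usage: tvnet_cmds endpoint|root
tvnet_cmds() {
    case "$1" in
        endpoint) local_ip=192.168.2.1; peer_ip=192.168.2.2 ;;
        root)     local_ip=192.168.2.2; peer_ip=192.168.2.1 ;;
        *) echo "usage: tvnet_cmds endpoint|root" >&2; return 1 ;;
    esac
    echo "sudo ip link set eth1 up"
    # The /24 prefix length is an assumption, not taken from the thread.
    echo "sudo ip addr add ${local_ip}/24 dev eth1"
    # Three pings, forced out through eth1, to the other side.
    echo "ping -c 3 -I eth1 ${peer_ip}"
}
```

Running `tvnet_cmds endpoint` prints the three commands for the endpoint side; piping the output to `sh` executes them.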

Is this a typo, or did you really set this value?

 then follow the steps below to flash two Xavier devices respectively
0x09190000 for root
0x09190000 for endpoint

Sorry, I entered it wrong.

0x09190000 for root
0x09191000 for endpoint

I performed the test this way, and the test steps were fine.

My test steps are fine. Have you actually tested with two Xaviers?

Could you clarify whether “test steps were fine” means that you believe your steps are correct, or that the result is now OK?

We have already tested two Xavier devkits connected to each other, and lspci on the RP device detects the other Xavier as an EP device.
In your case, does lspci fail to detect the EP at all, or does the crash start only when the two sides ping each other?

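Hello, my test steps are OK; it is the test result that has a problem.
I boot the EP device first and then the root device. On the root device, lspci shows the EP device, and the eth1 device node is also created.
After I configure the IP addresses manually, ping fails, and a few minutes later the EP device hangs and reboots.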
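
A quick way to confirm on the root port side that the endpoint enumerated is to filter `lspci -n` for NVIDIA functions that are not bridges. This is a minimal sketch: the helper name `non_bridge_nvidia` is hypothetical, and it relies only on the NVIDIA vendor ID `10de` and the PCI-to-PCI bridge class `0604`. It does not assume the endpoint's specific device ID, so any other non-bridge NVIDIA function would be listed too.

```shell
#!/bin/sh
# Read `lspci -n` output on stdin and print NVIDIA (vendor 10de) functions
# whose class is not 0604 (PCI-to-PCI bridge), i.e. candidates for the EP.
# Expected line shape: "<slot> <class>: <vendor>:<device> ..."
non_bridge_nvidia() {
    awk '$2 != "0604:" && $3 ~ /^10de:/ { print }'
}

# Usage on the root port system:
#   lspci -n | non_bridge_nvidia
```

An empty result would suggest that the endpoint was not enumerated, whereas a listed function with no working ping points at the network function rather than at link training.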

Got it. We will try to reproduce this error first.

Could you also share the full test steps and commands?


Reference: "PCIe Endpoint Mode" in the Jetson Linux Developer Guide 34.1 documentation
Test steps:
1. Prepare two Xaviers.

2. One Xavier is in root port mode:
p2972-0000.conf.common:ODMDATA=0x9190000;

3. The other Xavier is in endpoint mode:
p2972-0000.conf.common:ODMDATA=0x9191000;
After both Xaviers are flashed, they are connected through a PCIe adapter board.

4. Boot the endpoint mode Xavier first and run the following commands:
cd /sys/kernel/config/pci_ep/
sudo mkdir functions/pci_epf_tvnet/func1
sudo chmod 777 functions/pci_epf_tvnet/func1/msi_interrupts
echo 16 > functions/pci_epf_tvnet/func1/msi_interrupts
sudo ln -s functions/pci_epf_tvnet/func1 controllers/141a0000.pcie_ep/
sudo chmod 777 controllers/141a0000.pcie_ep/start
echo 1 > controllers/141a0000.pcie_ep/start

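5. After the commands on the endpoint side have finished, power on the root mode Xavier and run lspci, with the following result:

6. Endpoint side: sudo ifconfig eth1 192.168.2.1 up
Root side: sudo ifconfig eth1 192.168.2.2 up

7. Endpoint side information:

8. Run ping; the two sides cannot communicate.

With JetPack 32.4.4, the same steps result in normal communication.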
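
In step 4, the `chmod 777` calls appear to be there so that the plain `echo ... >` redirection, which runs as the unprivileged user, can write to the root-owned attributes. One common alternative is to pipe the value through `sudo tee`, which leaves the attribute permissions unchanged. The sketch below assumes the same configfs paths as step 4; the function names, the `SUDO` override variable and the optional base-directory argument are illustrative additions so the logic can be tried against a scratch directory.

```shell
#!/bin/sh
# Sketch: configure the tvnet endpoint function without chmod.
# SUDO defaults to "sudo"; set SUDO="" when already root or when testing.
SUDO=${SUDO-sudo}

# write_attr VALUE FILE: write VALUE to FILE with elevated privileges.
write_attr() {
    printf '%s\n' "$1" | $SUDO tee "$2" >/dev/null
}

# setup_tvnet_ep [BASE]: BASE defaults to the configfs path from step 4.
# The body runs in a subshell so the caller's working directory is kept.
setup_tvnet_ep() (
    cd "${1:-/sys/kernel/config/pci_ep}" || exit 1
    $SUDO mkdir functions/pci_epf_tvnet/func1 || exit 1
    write_attr 16 functions/pci_epf_tvnet/func1/msi_interrupts || exit 1
    $SUDO ln -s functions/pci_epf_tvnet/func1 controllers/141a0000.pcie_ep/ || exit 1
    write_attr 1 controllers/141a0000.pcie_ep/start
)
```

Calling `setup_tvnet_ep` with no argument performs the same four operations as step 4, in the same order, and stops at the first one that fails.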

Sorry for the late response. Our team is investigating and will have an update soon. Thanks.

Hi,

Please add this patch to your EP side kernel and test again.

diff --git a/kernel/kernel-5.10/drivers/pci/controller/dwc/pcie-tegra194.c b/kernel/kernel-5.10/drivers/pci/controller/dwc/pcie-tegra194.c
index 7fde1a4d6..b058c3cb8 100644
--- a/kernel/kernel-5.10/drivers/pci/controller/dwc/pcie-tegra194.c
+++ b/kernel/kernel-5.10/drivers/pci/controller/dwc/pcie-tegra194.c
@@ -2164,7 +2164,9 @@ static void tegra_pcie_enable_legacy_interrupts(struct pcie_port *pp)
 	val |= APPL_INTR_EN_L1_8_INTX_EN;
 	val |= APPL_INTR_EN_L1_8_AUTO_BW_INT_EN;
 	val |= APPL_INTR_EN_L1_8_BW_MGT_INT_EN;
+#if 0
 	val |= APPL_INTR_EN_L1_8_EDMA_INT_EN;
+#endif
 	if (IS_ENABLED(CONFIG_PCIEAER))
 		val |= APPL_INTR_EN_L1_8_AER_INT_EN;
 	appl_writel(pcie, val, APPL_INTR_EN_L1_8_0);
@@ -3569,9 +3571,11 @@ static void pex_ep_event_pex_rst_deassert(struct tegra_pcie_dw *pcie)
 	val |= APPL_INTR_EN_L1_0_0_RDLH_LINK_UP_INT_EN;
 	appl_writel(pcie, val, APPL_INTR_EN_L1_0_0);
 
+#if 0
 	val = appl_readl(pcie, APPL_INTR_EN_L1_8_0);
 	val |= APPL_INTR_EN_L1_8_EDMA_INT_EN;
 	appl_writel(pcie, val, APPL_INTR_EN_L1_8_0);
+#endif
 
 	if (pcie->enable_cdm_check) {
 		val = appl_readl(pcie, APPL_INTR_EN_L0_0);
@@ -3880,6 +3884,7 @@ tegra_pcie_ep_get_features(struct dw_pcie_ep *ep)
 	return &tegra_pcie_epc_features;
 }
 
+#if 0
 /* Reserve BAR0_BASE + BAR0_MSI_OFFSET of size SZ_64K as MSI page */
 static int tegra_pcie_ep_set_bar(struct dw_pcie_ep *ep, u8 func_no,
 				 struct pci_epf_bar *epf_bar)
@@ -3903,11 +3908,11 @@ static int tegra_pcie_ep_set_bar(struct dw_pcie_ep *ep, u8 func_no,
 
 	return 0;
 }
+#endif
 
 static struct dw_pcie_ep_ops pcie_ep_ops = {
 	.raise_irq = tegra_pcie_ep_raise_irq,
 	.get_features = tegra_pcie_ep_get_features,
-	.set_bar = tegra_pcie_ep_set_bar,
 };
 
 static int tegra_pcie_config_ep(struct tegra_pcie_dw *pcie,
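
One way to apply a patch like this is to save it to a file and run `patch -p1` from the directory that contains `kernel/kernel-5.10`, doing a dry run first. This is a sketch, not NVIDIA's documented procedure: the function name and both arguments (source root and patch file path) are placeholders, and the kernel still has to be rebuilt and installed on the endpoint afterwards.

```shell
#!/bin/sh
# apply_ep_patch SRC_ROOT PATCH_FILE
# SRC_ROOT   directory containing kernel/kernel-5.10 (placeholder path)
# PATCH_FILE absolute path to the saved diff (placeholder path)
# The paths in the diff start with a/ and b/, hence -p1.
apply_ep_patch() (
    cd "$1" || exit 1
    # Dry run first so a mismatching tree leaves the sources untouched.
    patch -p1 --dry-run < "$2" >/dev/null || {
        echo "patch does not apply cleanly to $1" >&2
        exit 1
    }
    patch -p1 < "$2"
)
```

The dry run matters here because the hunk offsets in the diff are tied to one specific revision of `pcie-tegra194.c`; on a different source release the hunks may need manual adjustment.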

Hi @WayneWWW,
I tried your patch on the endpoint, but I still see errors on the endpoint side.
You can find the syslog of the endpoint below.

My setup is two AGX Xaviers, and now the root system no longer shuts down normally (not sure if it is related).

Sorry, but could you always share your log as a text file?

Also, what is the error here? Not able to ping?

Also, are you a co-worker of the original owner of this post?

Sorry for posting just a picture;
the endpoint doesn’t get a network connection, which makes it a bit hard to get a text file…

My plan is to put an AGX Xavier into endpoint mode and get the virtual network running between two Xaviers.
I have done this multiple times with the old JetPack, so the steps are clear to me, but I cannot get it up and running with the new JetPack. At first, as for the original owner of this post, the endpoint Xavier hung and restarted.
Then I tried the patch you posted. This is where I am now.

The endpoint Xavier no longer restarts, but the network still does not work.
Ping fails from both sides. The endpoint just returns “Destination Host Unreachable”.
On the root Xavier, the USB driver is now stuck and I can no longer use the mouse or keyboard. The Ethernet connection is dying as well.
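
When the endpoint has no working network, the kernel log can still be saved as text locally and moved off the board another way (for example a USB drive or the serial console). The sketch below is one possible approach, not an NVIDIA-recommended procedure: the helper name `pcie_log_lines`, the output file names and the keyword list are assumptions, with the keywords chosen to match the driver names that appear in this thread.

```shell
#!/bin/sh
# Read a kernel log on stdin and keep only lines that mention the PCIe
# controller or the virtual network function (case-insensitive).
# The keyword list is an assumption; widen it if relevant lines are missed.
pcie_log_lines() {
    grep -iE 'pcie|pci_epf|tvnet'
}

# Usage on the endpoint (writes plain-text files in the current directory):
#   sudo dmesg > ep_dmesg_full.txt
#   pcie_log_lines < ep_dmesg_full.txt > ep_dmesg_pcie.txt
```

Keeping the full log next to the filtered one is deliberate, since the filtered file is only a reading aid and the unfiltered text is what is usually requested.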

PS.: I am not related to the original owner, but facing the same setup and problem

PS.: I am not related to the original owner, but facing the same setup and problem

Please file your own topic then.

Sorry, I did not know that I should start the same thing again, but I have done so now.