PCIe EP test fails on AGX Orin: "RP DMA address is null" (R36.3)

Platform:
EP: Jetson AGX Orin Industrial, R36.3.
RP: Jetson AGX Xavier Industrial, R35.3.1.

Hello, I am trying to configure PCIe x8 (C5) in endpoint mode on AGX Orin, so I made the following changes:

  1. Add ODMDATA.
    ODMDATA="gbe-uphy-config-0,nvhs-uphy-config-1,hsio-uphy-config-16,hsstp-lane-map-3";
  2. Flash the device.
    sudo ./flash.sh jetson-agx-orin-devkit-industrial internal

I use AGX Xavier as the RP and AGX Orin as the EP.
After reflashing and rebooting the AGX Orin, I followed the "PCIe Endpoint Mode" page of the NVIDIA Jetson Linux Developer Guide to configure it.

Then I booted the AGX Xavier and used 'lspci -v' to list devices. I can see the EP device 229a, which means the AGX Orin's PCIe C5 was flashed in EP mode.
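One way to script this check on the RP is to look for the endpoint's vendor:device pair in numeric lspci output. The sketch below assumes the vendor ID is NVIDIA's 10de (only the device ID 229a appears in this thread), and the helper name has_pci_id is made up for illustration.

```shell
#!/bin/sh
# has_pci_id VENDOR:DEVICE
# Reads `lspci -n` style text on stdin and succeeds if any line carries
# the given vendor:device pair as a whole field.
has_pci_id() {
    grep -qiE "(^|[[:space:]])$1([[:space:]]|\$)"
}

# Example on the RP (10de is an assumed vendor ID, 229a is from the post):
# lspci -n | has_pci_id 10de:229a && echo "EP visible"
```

`lspci -d 10de:229a` applies the same filter without a helper; the function form is only convenient when the lspci output has already been saved to a log.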

But when I followed the test procedure at this link, it failed:
https://docs.nvidia.com/jetson/archives/r36.3/DeveloperGuide/SD/Communications/PcieEndpointMode.html?highlight=pcie%20endpoint#execution

I ran 'chmod 777 edmalib_test' and 'cat edmalib_test'.
It returned the error message 'RP DMA address is null'.

Do I need to insmod another driver on the RP device (in my case, the AGX Xavier)?
Or is there any other configuration that I need to add?

I am looking forward to your reply.

Hi,

If you are designing a custom base board, some adaptation configuration is needed.
Otherwise, your board may not work properly.

For the Orin AGX series, you can refer to the document below:
https://docs.nvidia.com/jetson/archives/r36.3/DeveloperGuide/HR/JetsonModuleAdaptationAndBringUp/JetsonAgxOrinSeries.html?highlight=universal%20serial%20bus#jetson-agx-orin-platform-adaptation-and-bring-up
(Please be aware that the above link is for rel-36.3/JetPack 6.0.)

This document covers the following configuration:

  1. pinmux change & GPIO configuration
  2. EEPROM change as most custom boards do not have an EEPROM on it.
  3. Kernel porting
  4. PCIe configuration
  5. USB configuration
  6. MGBE configuration
  7. RGMII configuration

Thanks!

Hello,

For Orin AGX, I have done some adaptation configuration for the custom base board, including:

  1. pinmux change & GPIO configuration
    I use PCIe x8 (C7) in RP mode and PCIe x8 (C5) in EP mode, so I changed ODMDATA:
    ODMDATA="gbe-uphy-config-0,nvhs-uphy-config-1,hsio-uphy-config-16,hsstp-lane-map-3";
    I also updated the pinmux.

  2. EEPROM change as most custom boards do not have an EEPROM on it.

  3. PCIe configuration


PCIe x8 (C7) in RP mode is used for a Non-Volatile memory controller, and it now works well.

  4. MGBE configuration
    ethernet@6810000 {
        status = "disabled";
    };

  5. RGMII configuration
    ethernet@2310000 {
        status = "okay";
    };

Ethernet and C7 in RP mode now both work well.

For AGX Xavier, I did not change anything and I use the devkit. I use PCIe x8 (C5) in RP mode on the AGX Xavier.

Now I connect the AGX Orin's C5 (EP) to the AGX Xavier's C5 (RP). On the AGX Xavier, 'lspci' finds the EP device, but when I run 'cat edmalib_test' on the Orin, the test fails.

So I want to know what may cause this problem.
I would also like to know whether I need to install a driver on the AGX Xavier when it is used as the RP device.

Thanks.

Do you have another Orin AGX that can be used as the RP here instead of the AGX Xavier?

Yes, but it is not a devkit. The other Orin AGX can only run on a custom board.
I will let you know after I try it.

The test passes when I use another Orin AGX as the RP.
The test result:
EP:

RP:

Based on the result, some configuration must be required on the RP. My final setup will use an x86-64 PC (Ubuntu 22.04) as the RP. Do you have any suggestions for configuring the host?

Thanks!

Sorry, this driver actually cannot run on an x86-64 machine. It has only been tested with an arm64 RP such as Orin AGX.

Okay. There is another way to test, from R35.4.1, but it does not work well either. Could you help me with it?
PCIe Endpoint Mode — Jetson Linux Developer Guide documentation


When I use another Orin AGX as the RP, this method does not work well.

For example, as you can see in this image:

On the EP device, the BAR0 phy_addr is 0xf0000000. On the RP device, the memory address should be 'Memory at 0x2800000000'.
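The RP-side address can also be read from sysfs instead of being copied from lspci. Each line of a PCI device's resource file holds the start address, end address, and flags of one region, with BAR0 on the first line. A small sketch, assuming the EP enumerates at the hypothetical bus address 0005:01:00.0 (substitute the one lspci reports); bar0_range is an illustrative helper name:

```shell
#!/bin/sh
# bar0_range FILE
# FILE is a sysfs "resource" file. Its first line describes BAR0 as
# "<start> <end> <flags>"; print the start and end addresses.
bar0_range() {
    awk 'NR == 1 { print $1, $2 }' "$1"
}

# Example (hypothetical bus address, adjust to the lspci output):
# bar0_range /sys/bus/pci/devices/0005:01:00.0/resource
```

If the printed start address differs from the one used with devmem, the RP-side reads are going to the wrong location.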

My test steps:

  1. On the endpoint device:

busybox devmem 0xf0000000 32 0x98765432

busybox devmem 0xf0000000

I can see 0x98765432.

  2. On the root port system:

busybox devmem 0x2800000000

I could not see 0x98765432 on the RP device.

  3. On the root port system:

busybox devmem 0x2800000000 32 0x12345678

  4. On the endpoint device:

busybox devmem 0xf0000000

I could not see 0x12345678 on the EP device. It still reads 0x98765432.
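The read-side half of this procedure can be wrapped in a small check so that each side reports a clear pass or fail. This is only a sketch: expect_value and the DEVMEM override are illustrative names, and the addresses are the ones quoted in this post.

```shell
#!/bin/sh
# Command used to read physical memory; override DEVMEM for a dry run.
DEVMEM="${DEVMEM:-busybox devmem}"

# expect_value ADDR EXPECTED
# Read a 32-bit word at ADDR and compare it numerically with EXPECTED,
# so that 0xdeadbeef and 0xDEADBEEF are treated as the same value.
expect_value() {
    addr=$1
    expected=$2
    actual=$($DEVMEM "$addr" 32) || return 2
    if [ "$(($actual))" -eq "$(($expected))" ]; then
        echo "OK: $addr reads $actual"
    else
        echo "MISMATCH: $addr reads $actual, expected $expected"
        return 1
    fi
}

# Example, on the RP after the EP-side write:
# expect_value 0x2800000000 0x98765432
```

The same helper works in the other direction: after writing 0x12345678 on the RP, run `expect_value 0xf0000000 0x12345678` on the EP.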

Do you have any suggestions about this method?
Thanks!

Hi, WayneWWW

I am still working on this case, and I found another topic that I think is similar to mine.

That topic did not reach a solution, but it suggested using vnet.
However, in R36.3 I cannot find the configuration for the Tegra vnet driver in the Developer Guide. It seems that the vnet section has been removed. Is there a problem with it?

Hi, WayneWWW:

I tried the following approach on R36.3. The method is from R35.4.1:
https://docs.nvidia.com/jetson/archives/r35.4.1/DeveloperGuide/text/SD/Communications/PcieEndpointMode.html#bringing-up-an-ethernet-interface-over-pcie

On R36.3, I changed only two places in kernel/kernel-jammy-src/drivers/pci/controller/dwc/pcie-tegra194.c and recompiled the kernel.

  1. On the EP device, I replaced Image and ran insmod pci-epf-tegra-vnet.ko. Then I configured the EP device with these commands:

cd /sys/kernel/config/pci_ep/
mkdir functions/pci_epf_tvnet/func1
echo 16 > functions/pci_epf_tvnet/func1/msi_interrupts
ln -s functions/pci_epf_tvnet/func1 controllers/141a0000.pcie-ep/
echo 1 > controllers/141a0000.pcie-ep/start

The eth1 interface appeared, and I noticed that the PCIe link is not up.
  2. On the RP device, after a reboot, 'lspci' finds the device, but eth1 was not created.

Could you please help me check this case?
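To see whether the link actually trained, the negotiated speed and width can be read from the LnkSta line of `lspci -vv` (run as root so that the capability list is shown). A sketch that extracts both fields; link_status is an illustrative helper name and the bus address in the usage line is hypothetical:

```shell
#!/bin/sh
# link_status
# Reads `lspci -vv` text on stdin and prints "<speed> <width>" for each
# LnkSta line, dropping annotations such as "(ok)" or "(downgraded)".
link_status() {
    sed -n 's/^[[:space:]]*LnkSta:[[:space:]]*Speed \([^ ,]*\)[^,]*, Width \(x[0-9]*\).*/\1 \2/p'
}

# Example on the RP (hypothetical bus address):
# sudo lspci -vv -s 0005:01:00.0 | link_status
```

An empty result means no LnkSta line was found, which usually indicates that lspci was run without root privileges or that the device is not present at that address.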
Thanks!

Hi WayneWWW:

I changed the BSP to R35.6 and ran some other tests. I opened a new topic for them. I think you are an expert, and I really hope you can give me some suggestions on the following topic. Thanks.

Agx-Orin: PCIe RP could read ram memory after increasing aperture size for mapping non-prefetchable BARs of endpointson RP - Jetson & Embedded Systems / Jetson AGX Orin - NVIDIA Developer Forums

I am looking forward to your reply.

Use rel-35.6 first to make sure your hardware is really OK. I mean that both the RP and EP sides should use the same rel-35.6 BSP.

Yes, I reflashed the EP and RP devices with the same rel-35.6 BSP.

Share the exact error log as a text file and the exact steps you are using for testing.

Your whole post is full of log screenshots and odd formatting, which makes it hard to read. Also, none of your logs are from rel-35.6. They are all from rel-36, which does not help for a rel-35 case.

Hi WayneWWW:

I have updated the test log in the following topic:

Agx-Orin: PCIe RP could read ram memory after increasing aperture size for mapping non-prefetchable BARs of endpointson RP - Jetson & Embedded Systems / Jetson AGX Orin - NVIDIA Developer Forums

I am looking forward to your reply.
Thanks!