Shared RAM on PCIe Endpoint Device: 'devmem: mmap:' error

Hello, I’ve flashed my AGX Orin Dev kit in PCIe endpoint mode using these instructions.

My root complex device is a Jetson Xavier NX.
Both devices use JetPack 5.1.1 (L4T r35.3.1).

Steps I have done for Endpoint setup, in the following order:

  • Recompile kernel with the necessary change to arch/arm64/configs/tegra_defconfig
  • Make the necessary change to p3701.conf.common to enable EP mode.
  • Successfully flash the AGX Orin with the above changes (sudo ./flash.sh jetson-agx-orin-devkit mmcblk0p1).
  • Connect the devices with my custom PCIe cable (designed using PCIe Endpoint Design Guidelines).
  • Boot up the Endpoint (AGX Orin).
  • Run the following commands on Endpoint:
    cd /sys/kernel/config/pci_ep/
    mkdir functions/pci_epf_nv_test/func1
    echo 0x10de > functions/pci_epf_nv_test/func1/vendorid
    echo 0x0001 > functions/pci_epf_nv_test/func1/deviceid
    ln -s functions/pci_epf_nv_test/func1 controllers/141a0000.pcie_ep/
    echo 1 > controllers/141a0000.pcie_ep/start
  • Boot up the Root port (Xavier NX).
  • Determine Endpoint BAR address:

nvidia@nvidia-agx-orin:/sys/kernel/config/pci_ep$ sudo dmesg | grep pci_epf_nv_test
[ 168.885340] pci_epf_nv_test pci_epf_nv_test.0: BAR0 RAM phys: 0x17082e000
[ 168.885360] pci_epf_nv_test pci_epf_nv_test.0: BAR0 RAM IOVA: 0xffff0000
[ 168.885405] pci_epf_nv_test pci_epf_nv_test.0: BAR0 RAM virt: 0x00000000c2c53987

  • Access the Endpoint RAM - THIS STEP FAILS:

nvidia@nvidia-agx-orin:/sys/kernel/config/pci_ep$ sudo busybox devmem 0x17082e000
devmem: mmap: Operation not permitted
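
For anyone repeating the last two steps, the address lookup can be scripted instead of copied by hand. This is only a minimal sketch, not something from the NVIDIA docs: the function name bar0_phys_from_dmesg is hypothetical, and it assumes the driver logs the address in the "BAR0 RAM phys: 0x..." format shown above.

```shell
#!/bin/sh
# Hypothetical helper: read dmesg text on stdin and print the most recent
# "BAR0 RAM phys" address logged by pci_epf_nv_test.
# Assumes the log line format shown in this post.
bar0_phys_from_dmesg() {
    sed -n 's/.*BAR0 RAM phys: \(0x[0-9a-fA-F]*\).*/\1/p' | tail -n 1
}

# Usage on the endpoint (needs root; not executed by this snippet):
#   addr=$(sudo dmesg | bar0_phys_from_dmesg)
#   sudo busybox devmem "$addr"
```

Taking the last match means a re-created endpoint function does not leave a stale address from an earlier log line.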

Besides the recompiled kernel step, which I did, is there anything else I may have done wrong?

Please check the documentation.

https://docs.nvidia.com/jetson/archives/r35.4.1/DeveloperGuide/text/SD/Communications/PcieEndpointMode.html?highlight=endpoint#prepare-for-testing

→ In the 5.x releases, the Linux kernel has stricter security. To access the shared RAM, you need to add CONFIG_STRICT_DEVMEM=n to tegra_defconfig and recompile the kernel.
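
Before retrying devmem, it can help to confirm that the running kernel really was built with this option turned off. A minimal sketch follows; the function name strict_devmem_state is hypothetical, and the usage line assumes the kernel exposes its config at /proc/config.gz (which requires CONFIG_IKCONFIG_PROC and may not hold for every build).

```shell
#!/bin/sh
# Hypothetical helper: read a kernel config on stdin and report whether
# CONFIG_STRICT_DEVMEM is enabled in it. A disabled option normally shows
# up as "# CONFIG_STRICT_DEVMEM is not set", which does not match below.
strict_devmem_state() {
    if grep '^CONFIG_STRICT_DEVMEM=y' > /dev/null; then
        echo "enabled"
    else
        echo "disabled"
    fi
}

# Usage on the endpoint (assumes /proc/config.gz exists; not executed here):
#   zcat /proc/config.gz | strict_devmem_state
```

If this still prints "enabled" after flashing, the rebuilt kernel image is probably not the one that booted.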

@WayneWWW Thanks- It seems my modified kernel wasn’t being properly flashed to the AGX Endpoint before, so the tegra_defconfig change wasn’t taking effect.

Now I’ve got the busybox devmem <BAR0_RAM_phys> command working and can read/write the endpoint RAM.

But even after running the setpci command in the docs, the two Jetsons still don’t seem to recognize each other with lspci -v. I get:

setpci -s 0005:01:00.0 COMMAND=0x02
Warning: No devices selected for "COMMAND=0x02".

I have tried booting both Root and Endpoint in different orders, but no luck.
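
The setpci warning only says that no device matched the selector, so a quick sysfs check before calling setpci makes that explicit. This is a rough sketch with a hypothetical function name (pci_device_present); the address 0005:01:00.0 is simply the one used above and may differ depending on the root port.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given domain:bus:dev.fn entry
# exists under the PCI sysfs directory. The optional second argument
# overrides the default directory (handy for testing).
pci_device_present() {
    bdf=$1
    sysfs_dir=${2:-/sys/bus/pci/devices}
    [ -e "$sysfs_dir/$bdf" ]
}

# Usage on the root port (needs root; not executed by this snippet):
#   if pci_device_present 0005:01:00.0; then
#       sudo setpci -s 0005:01:00.0 COMMAND=0x02
#   else
#       echo "0005:01:00.0 has not been enumerated" >&2
#   fi
```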

If you mean that lspci cannot detect the device, then it is not a kernel config issue.

Even a hardware problem could lead to this.

My ls /sys/bus/pci/devices/ results are as follows on AGX Orin EP:

root@nvidia-agx-orin:/sys/kernel/config/pci_ep# ls /sys/bus/pci/devices/
0001:00:00.0 0001:01:00.0

So controller 5 isn’t set apparently. I made sure to set the ODMDATA bit before flashing.

I will check internally to determine if it’s a hardware issue. But also, is there any command I can run on the EP to make sure the ODMDATA was indeed set correctly?


Based on advice elsewhere on these forums, I tried binding / unbinding, and this is the dmesg result trace.

Of course, I have echo 1 > controllers/141a0000.pcie_ep/start before this.

Please post the log as full text rather than a screenshot first.

Also, why are you running this on your AGX Orin? You should set it from your RP.

setpci -s 0005:01:00.0 COMMAND=0x0

That command assumes that both the RP and the EP are AGX Orin devices.

Hi WayneWWW, revisiting this — my RP is Xavier NX and EP is Orin AGX, not 2 Orin AGX devices.

Would this be a potential reason for failure to detect over PCIe?
As in, is there something extra I must do on the RP or EP side?

Attaching a photo of my setup. The green device you see is a custom card designed using PCIe Endpoint Design Guidelines.

I’m connecting the two using an M.2 to PCIe x16 riser adapter: this one.

Hi,

I am no hardware guy. Cannot comment on whether this connection would work or not.

But I guess it does not work.

Got it, are there any hardware folks you can tag who may be able to help with bringing up an Xavier NX M.2<–> AGX Orin PCIe connection? Or, should I make a separate post?

Here are the additional debug steps I’ve tried:

  • Replacing the Xavier NX RP with a different machine that has an M.2 slot (still did not detect the AGX Orin EP). Sadly I do not have a second x16 device to test with.
  • Tracing connectivity between the M.2 side’s pins and AGX Orin’s PCIe pins (i.e. crossover is correct)
  • Re-checking custom kernel, ODMData, flash steps - all are fine.
  • Different methods of powering the PCIe slot on the M.2 to x16 adapter (bench 12 V supply vs. wall adapter). I was concerned the slot was being powered incorrectly, but this had no effect.
  • Re-enumerating the PCIe bus using echo 1 > /sys/bus/pci/rescan and changing the RP / EP boot order - no effect.
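
To make the rescan result unambiguous, the device list can be captured before and after and the two compared. A minimal sketch, assuming plain text files with one device address per line; the function name new_pci_devices and the /tmp file names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of the "after" list ($2) that do not
# appear in the "before" list ($1). Prints nothing if no device was added.
new_pci_devices() {
    # -x whole-line match, -F fixed strings, -v invert, -f patterns from file.
    # grep exits non-zero when it prints nothing, so mask that status.
    grep -vxF -f "$1" "$2" || true
}

# Usage on the root port (needs root; not executed by this snippet):
#   ls /sys/bus/pci/devices > /tmp/before
#   echo 1 | sudo tee /sys/bus/pci/rescan > /dev/null
#   ls /sys/bus/pci/devices > /tmp/after
#   new_pci_devices /tmp/before /tmp/after
```

The usage pipes through sudo tee because a plain sudo echo 1 > /sys/bus/pci/rescan performs the redirection as the unprivileged user.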

Our team has triple-checked that our custom PCIe crossover board’s buffers are getting power correctly, and it was built following NVIDIA’s design guidelines for PCIe, so we are quite confident that part is right, especially after probing the pins.

My thinking: could the AGX Orin EP require some special kernel config or patch for x4 that I’m unaware of, since the M.2 slot (RP) supports only up to x4 while the physical AGX slot is x16?

Hi,

Please refer to the "Jetson AGX Xavier Series PCIe Endpoint Design Guidelines Application Note". We didn’t validate what you are doing here, so we cannot guarantee that it will work.
