AGX Orin: PCIe RP could not read RAM memory after increasing aperture size for mapping non-prefetchable BARs of endpoints on RP

Platform:
EP: Jetson AGX Orin industrial. R35.6.
RP: Jetson AGX Orin industrial. R35.6.

I am using two Orin modules for RP/EP communication, and I have made modifications on the EP and RP according to this link:
Increasing size of BAR0 in Endpoint Mode - Jetson & Embedded Systems / Jetson AGX Xavier - NVIDIA Developer Forums

On the Root Port side: PCIe C5 is used for the RP.
I increased the aperture size for mapping non-prefetchable BARs of endpoints from its current 128MB to 1GB.
To do this, I modified the PCIe C5 node of the device tree.
pcie@141a0000{


}

On the endpoint side:
I modified the file ‘/drivers/pci/endpoint/functions/pci-epf-nv-test.c’ and changed BAR0_SIZE to SZ_64M, SZ_128M, or SZ_512M. I also added the other patch from the link mentioned above.

the modified file:
pci-epf-nv-test.txt (7.3 KB)

After these modifications, the RP detects that the BAR size has increased, but the RP cannot read the RAM memory. My final goal is to increase the BAR0 size beyond 512M.

Test flow:

  1. On EP:
    cd /sys/kernel/config/pci_ep/;
    mkdir functions/pci_epf_nv_test/func1;
    echo 0x10de > functions/pci_epf_nv_test/func1/vendorid;
    echo 0x0001 > functions/pci_epf_nv_test/func1/deviceid;
    ln -s functions/pci_epf_nv_test/func1 controllers/141a0000.pcie_ep/;
    echo 1 > controllers/141a0000.pcie_ep/start;

dmesg :

  2. On RP:
    lspci -v
    setpci -s 0005:01:00.0 COMMAND=0x02
    busybox devmem <addr>
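
For reference, here is a minimal sketch of what the value written by `setpci ... COMMAND=0x02` means. The bit names follow the standard PCI COMMAND register layout; the helper name `decode_command` is made up for this illustration. Note that `setpci` without a mask writes the whole register, so every other bit (including Bus Master Enable) ends up cleared.

```python
# Low bits of the PCI COMMAND register (config space offset 0x04).
COMMAND_BITS = {
    0: "I/O Space Enable",
    1: "Memory Space Enable",
    2: "Bus Master Enable",
}


def decode_command(value):
    """Return the names of the low COMMAND bits that are set in value."""
    return [name for bit, name in COMMAND_BITS.items() if value & (1 << bit)]


# The value used in the test flow above: only memory decoding is enabled.
print(decode_command(0x02))
```

So the command enables memory decoding on the endpoint, which is what a `devmem` read of a BAR from the RP needs.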

By the way, there is another test that works well. If I don’t change the RP, that is, if I don’t increase the aperture size for mapping non-prefetchable BARs and keep its current 128MB, the RP can read the RAM memory.

In the end, I found that no matter what the BAR0 size of the endpoint is (SZ_64M, SZ_128M, or SZ_512M), once I change the aperture size for mapping non-prefetchable BARs of endpoints from 128M to 1024M on the RP, the RP cannot read the RAM memory.

Do you have any suggestions on this problem?
Thanks!

Hi,
If the device cannot be flashed/booted, please refer to the page to get uart log from the device:
Jetson/General debug - eLinux.org
And get logs of host PC and Jetson device for reference. If you are using custom board, you can compare uart log of developer kit and custom board to get more information.
Also please check FAQs:
Jetson AGX Orin FAQ
If possible, we would suggest following the quick start in the developer guide to re-flash the system:
Quick Start — NVIDIA Jetson Linux Developer Guide 1 documentation
And see if the issue still persists on a clean-flashed system.
Thanks!

Maybe you misunderstood the problem. It seems your answer has nothing to do with my question in this topic.

On Root Port side:PCIe C5 for RP.
Increasing aperture size for mapping non-prefetchable BARs of endpoints from its current 128MB size to 1GB size.
I modified pcie c5 node of device tree.
pcie@141a0000{

This is not allowed on Orin AGX. Only adjusting BAR0_SIZE is available.

Hi, WayneWWW:

I redid all the tests and recorded logs.
TEST 1:
I kept the default configuration on EP/RP. That is, I used the default “define BAR0_SIZE SZ_64K” on the EP side and did not increase the aperture size for mapping non-prefetchable BARs of endpoints on the RP. The result indicates that only the first 4K of BAR0 can be read and written from the RP.
The test log on EP:
orin-agx20241121-ep.log (3.2 KB)

The test log on RP:
orin-agx20241121-rp-only-4k-can-read.log (3.4 KB)
On the root port side, the RP reads beyond 4K:

root@orin-desktop:/home/orin# busybox devmem 0x2b28001000
0xFFFFFFFF
root@orin-desktop:/home/orin# 

On the EP side, when the RP reads beyond 4K, there are some error messages:

root@orin-desktop:/home/orin# [  909.345143] arm-smmu 12000000.iommu: Unhandled context fault: fsr=0x402, iova=0xffff1000, fsynr=0x3a0003, cbfrsynra=0x414, cb=5
[  909.357208] mc-err: (255) csr_pcie5r: EMEM address decode error
[  909.363319] mc-err:   status = 0x200640e2; hi_addr_reg = 0x000000ff addr = 0xffffffff00
[  909.371572] mc-err:   secure: yes, access-type: read
[  909.376688] mc-err: MC fault - no status: ECC scrub complete

TEST 2:
I kept the default configuration on the EP side, that is, I used the default ‘pci-epf-nv-test.c’ on the EP. But I increased the aperture size for mapping non-prefetchable BARs of endpoints from 128M to 1024M on the RP. When the EP and RP have both booted, the RP can recognize BAR0, but it cannot read or write BAR0.

The test log on RP:

root@orin-desktop:/home/orin# 
root@orin-desktop:/home/orin# lspci -v
0005:00:00.0 PCI bridge: NVIDIA Corporation Device 229a (rev a1) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0, IRQ 68
        Bus: primary=00, secondary=01, subordinate=ff, sec-latency=0
        I/O behind bridge: [disabled]
        Memory behind bridge: 40000000-400fffff [size=1M]
        Prefetchable memory behind bridge: 0000002740000000-00000027400fffff [size=1M]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Capabilities: [70] Express Root Port (Slot-), MSI 00
        Capabilities: [b0] MSI-X: Enable- Count=8 Masked-
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [148] Secondary PCI Express
        Capabilities: [168] Physical Layer 16.0 GT/s <?>
        Capabilities: [190] Lane Margining at the Receiver <?>
        Capabilities: [1c0] L1 PM Substates
        Capabilities: [1d0] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
        Capabilities: [2d0] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
        Capabilities: [308] Data Link Feature <?>
        Capabilities: [314] Precision Time Measurement
        Capabilities: [320] Vendor Specific Information: ID=0004 Rev=1 Len=054 <?>
        Capabilities: [388] Vendor Specific Information: ID=0006 Rev=0 Len=018 <?>
        Kernel driver in use: pcieport

0005:01:00.0 RAM memory: NVIDIA Corporation Device 0001
        Flags: fast devsel, IRQ 255
        Memory at 2af0000000 (32-bit, non-prefetchable) [size=64K]
        Memory at 2740000000 (64-bit, prefetchable) [size=128K]
        Memory at 2af0010000 (64-bit, non-prefetchable) [size=4K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Capabilities: [70] Express Endpoint, MSI 00
        Capabilities: [b0] MSI-X: Enable- Count=8 Masked-
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [148] Secondary PCI Express
        Capabilities: [168] Physical Layer 16.0 GT/s <?>
        Capabilities: [190] Lane Margining at the Receiver <?>
        Capabilities: [1b8] Latency Tolerance Reporting
        Capabilities: [1c0] L1 PM Substates
        Capabilities: [1d0] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
        Capabilities: [2d0] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
        Capabilities: [308] Data Link Feature <?>
        Capabilities: [314] Precision Time Measurement
        Capabilities: [320] Vendor Specific Information: ID=0003 Rev=1 Len=054 <?>
        Capabilities: [388] Vendor Specific Information: ID=0006 Rev=0 Len=018 <?>

root@orin-desktop:/home/orin# 
root@orin-desktop:/home/orin# busybox devmem 0x2af0000000
0xFFFFFFFF
root@orin-desktop:/home/orin# setpci -s 0005:01:00.0 COMMAND=0x02
root@orin-desktop:/home/orin# busybox devmem 0x2af0000000
0xFFFFFFFF
root@orin-desktop:/home/orin# 
root@orin-desktop:/home/orin# 

TEST 3:
I modified BAR0_SIZE to 128M in the file ‘pci-epf-nv-test.c’ and then recompiled the kernel.

//#define BAR0_SIZE SZ_64K
//#define BAR0_SIZE SZ_512M
#define BAR0_SIZE SZ_128M

the whole code of ‘pci-epf-nv-test.c’ is as follows:
pci-epf-nv-test.txt (7.3 KB)

If I don’t increase the aperture size for mapping non-prefetchable BARs on the RP, BAR0 is not recognized by ‘lspci -v’.

root@orin-desktop:/home/orin# lspci -v
0005:00:00.0 PCI bridge: NVIDIA Corporation Device 229a (rev a1) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0, IRQ 68
        Bus: primary=00, secondary=01, subordinate=ff, sec-latency=0
        I/O behind bridge: [disabled]
        Memory behind bridge: [disabled]
        Prefetchable memory behind bridge: 0000002740000000-00000027400fffff [size=1M]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Capabilities: [70] Express Root Port (Slot-), MSI 00
        Capabilities: [b0] MSI-X: Enable- Count=8 Masked-
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [148] Secondary PCI Express
        Capabilities: [168] Physical Layer 16.0 GT/s <?>
        Capabilities: [190] Lane Margining at the Receiver <?>
        Capabilities: [1c0] L1 PM Substates
        Capabilities: [1d0] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
        Capabilities: [2d0] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
        Capabilities: [308] Data Link Feature <?>
        Capabilities: [314] Precision Time Measurement
        Capabilities: [320] Vendor Specific Information: ID=0004 Rev=1 Len=054 <?>
        Capabilities: [388] Vendor Specific Information: ID=0006 Rev=0 Len=018 <?>
        Kernel driver in use: pcieport

0005:01:00.0 RAM memory: NVIDIA Corporation Device 0001
        Flags: fast devsel, IRQ 255
        Memory at 2740000000 (64-bit, prefetchable) [disabled] [size=128K]
        Memory at <unassigned> (64-bit, non-prefetchable) [disabled]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Capabilities: [70] Express Endpoint, MSI 00
        Capabilities: [b0] MSI-X: Enable- Count=8 Masked-
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [148] Secondary PCI Express
        Capabilities: [168] Physical Layer 16.0 GT/s <?>
        Capabilities: [190] Lane Margining at the Receiver <?>
        Capabilities: [1b8] Latency Tolerance Reporting
        Capabilities: [1c0] L1 PM Substates
        Capabilities: [1d0] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
        Capabilities: [2d0] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
        Capabilities: [308] Data Link Feature <?>
        Capabilities: [314] Precision Time Measurement
        Capabilities: [320] Vendor Specific Information: ID=0003 Rev=1 Len=054 <?>
        Capabilities: [388] Vendor Specific Information: ID=0006 Rev=0 Len=018 <?>

root@orin-desktop:/home/orin# 

I also see some errors in the kernel messages.
The whole log:
orin-agx20241121-rp-dont-detect-bar0-128M.log (71.1 KB)

The kernel messages indicate that there is no space for BAR0.

[   12.341580] pci 0005:00:00.0: BAR 14: no space for [mem size 0x0c000000]
[   12.349150] pci 0005:00:00.0: BAR 14: failed to assign [mem size 0x0c000000]
[   12.357069] pci 0005:00:00.0: BAR 15: assigned [mem 0x2740000000-0x27400fffff 64bit pref]
[   12.366158] pci 0005:01:00.0: BAR 0: no space for [mem size 0x08000000]
[   12.373645] pci 0005:01:00.0: BAR 0: failed to assign [mem size 0x08000000]
[   12.381466] pci 0005:01:00.0: BAR 2: assigned [mem 0x2740000000-0x274001ffff 64bit pref]
[   12.390527] pci 0005:01:00.0: BAR 4: no space for [mem size 0x00001000 64bit]
[   12.398541] pci 0005:01:00.0: BAR 4: failed to assign [mem size 0x00001000 64bit]
[   12.406903] pci 0005:00:00.0: PCI bridge to [bus 01-ff]
[   12.412939] pci 0005:00:00.0:   bridge window [mem 0x2740000000-0x27400fffff 64bit pref]
[   12.421943] pci 0005:00:00.0: Max Payload Size set to  256/ 256 (was  256), Max Read Rq  512
[   12.431352] pci 0005:01:00.0: Max Payload Size set to  256/ 256 (was  256), Max Read Rq  512

TEST 4:
But if I increase the aperture size for mapping non-prefetchable BARs of endpoints from 128M to 1024M on the RP, BAR0 is detected. However, the RP still cannot read the RAM memory using ‘busybox devmem’.

root@orin-desktop:/home/orin# lspci -v
0005:00:00.0 PCI bridge: NVIDIA Corporation Device 229a (rev a1) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0, IRQ 68
        Bus: primary=00, secondary=01, subordinate=ff, sec-latency=0
        I/O behind bridge: [disabled]
        Memory behind bridge: 40000000-4bffffff [size=192M]
        Prefetchable memory behind bridge: 0000002740000000-00000027400fffff [size=1M]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Capabilities: [70] Express Root Port (Slot-), MSI 00
        Capabilities: [b0] MSI-X: Enable- Count=8 Masked-
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [148] Secondary PCI Express
        Capabilities: [168] Physical Layer 16.0 GT/s <?>
        Capabilities: [190] Lane Margining at the Receiver <?>
        Capabilities: [1c0] L1 PM Substates
        Capabilities: [1d0] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
        Capabilities: [2d0] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
        Capabilities: [308] Data Link Feature <?>
        Capabilities: [314] Precision Time Measurement
        Capabilities: [320] Vendor Specific Information: ID=0004 Rev=1 Len=054 <?>
        Capabilities: [388] Vendor Specific Information: ID=0006 Rev=0 Len=018 <?>
        Kernel driver in use: pcieport

0005:01:00.0 RAM memory: NVIDIA Corporation Device 0001
        Flags: fast devsel, IRQ 255
        Memory at 2af0000000 (32-bit, non-prefetchable) [disabled] [size=128M]
        Memory at 2740000000 (64-bit, prefetchable) [disabled] [size=128K]
        Memory at 2af8000000 (64-bit, non-prefetchable) [disabled] [size=4K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Capabilities: [70] Express Endpoint, MSI 00
        Capabilities: [b0] MSI-X: Enable- Count=8 Masked-
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [148] Secondary PCI Express
        Capabilities: [168] Physical Layer 16.0 GT/s <?>
        Capabilities: [190] Lane Margining at the Receiver <?>
        Capabilities: [1b8] Latency Tolerance Reporting
        Capabilities: [1c0] L1 PM Substates
        Capabilities: [1d0] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
        Capabilities: [2d0] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
        Capabilities: [308] Data Link Feature <?>
        Capabilities: [314] Precision Time Measurement
        Capabilities: [320] Vendor Specific Information: ID=0003 Rev=1 Len=054 <?>
        Capabilities: [388] Vendor Specific Information: ID=0006 Rev=0 Len=018 <?>

root@orin-desktop:/home/orin# 
root@orin-desktop:/home/orin# 
root@orin-desktop:/home/orin# setpci -s 0005:01:00.0 COMMAND=0x02
root@orin-desktop:/home/orin# busybox devmem 0x2af0000000
0xFFFFFFFF
root@orin-desktop:/home/orin# 
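
The numbers in the TEST 4 `lspci` output can be cross-checked with a little arithmetic. This is only a sketch: all values are copied from the output above, and the last step assumes that BAR0 sits at the very start of the bridge window, which the output does not state.

```python
MIB = 1 << 20

# "Memory behind bridge: 40000000-4bffffff" (inclusive range)
window_start, window_end = 0x40000000, 0x4BFFFFFF
window_size = window_end - window_start + 1

# Endpoint BARs as listed by lspci in TEST 4
bar0_base, bar0_size = 0x2AF0000000, 128 * MIB
bar4_base = 0x2AF8000000

print(window_size // MIB, "MiB bridge window")
print(hex(bar0_base + bar0_size), "end of BAR0")

# Assumption: BAR0 is placed at the start of the bridge window. The
# difference would then be the offset between the two address views.
print(hex(bar0_base - window_start), "apparent offset")
```

The window is 192 MiB, the same size as the failed BAR 14 request in TEST 3, and BAR0 ends exactly where the third memory region (at 2af8000000) begins. The bridge window is printed in the 0x40000000 range while the endpoint BARs are printed in the 0x2af0000000 range, so the two listings are probably showing bus addresses and CPU addresses respectively.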

Finally, from what you have said, increasing the aperture size for mapping non-prefetchable BARs is not allowed on Orin AGX; only adjusting BAR0_SIZE is available. So here are some questions:

  1. In TEST 1, the RP side can only read 4K of BAR0 RAM memory. What should I do to read more than 4K?
  2. BAR0_SIZE is adjusted only on the EP. When I want to increase the BAR0 size to >= 128M, what should I do on the RP so that it recognizes BAR0? This relates to TEST 3.

Do you have any advice?

I am looking forward to your reply.
Thanks!