Enabling Orin Dev Kit PCIe EP mode

Hi,

I want to set up the Orin Dev Kit PCIe C5 controller as an endpoint (EP), connected over a PCIe crossover cable, and am following the instructions under Enable PCIe in a Customer CVB Design.

Step 1: in p3701.conf.common, edit line 164 to configuration #2, ODMDATA="gbe-uphy-config-0,hsstp-lane-map-3,hsio-uphy-config-16,nvhs-uphy-config-0";. There is a broken link to the T23x BCT Deployment Guide.
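For reference, this is how I read the edited line in p3701.conf.common (the value is taken directly from the document's configuration #2, so this is only my interpretation, not a verified working config):

# p3701.conf.common, around line 164 per the document (may differ by release)
ODMDATA="gbe-uphy-config-0,hsstp-lane-map-3,hsio-uphy-config-16,nvhs-uphy-config-0";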

Step 3: in Jetson_AGX_Orin_Pinmux_Config_Template_082422.xlsm, edit rows 205:283, columns AS, AT, and AY to GPIO(rsvd1), SFIO(PE*_CLKREQ_L), and Input, where * is 0-10. Changing the "Customer Usage" column to the respective values is not allowed; the spreadsheet reports "This value doesn't match the data validation restrictions defined for this cell." Is that expected?

Step 2 (listed after step 3): in tegra234-p3737-pcie.dtsi, line 43, under the pcie_ep@141a0000 node, the instructions say "Add the pipe2uphy phandle entries as a phy property" and "pipe2uphy DT nodes are defined in SoC DT". I can't find those values in tegra234-soc-pcie.dtsi below line 417 under the pcie_c5_ep: pcie_ep@141a0000 node. It is not clear what the syntax for pipe2uphy is; is it referring to phys and phy-names? Also, is status = "disabled"; supposed to be changed to enabled?

Step 3 (listed after the first step 3): in tegra234-p3737-pcie.dtsi, line 43, under the pcie_ep@141a0000 node, "add the reset-gpios property with the gpio phandle, the gpio number connected to PERST# and flags (GPIO_ACTIVE_LOW)". It is not clear what the syntax for reset-gpios is, or which gpio phandle and number to use.
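For both device-tree questions above, one thing I have tried as a cross-check (my own workaround, not from the documentation) is to decompile the flattened device tree on a running board and look at how the already-enabled controllers spell these properties:

sudo dtc -I fs -O dts /proc/device-tree -o live.dts 2>/dev/null   # assumes the device-tree-compiler package is installed
grep -n -E 'phys =|phy-names|reset-gpios' live.dts                # compare against the pcie_ep@141a0000 node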

Can these steps be given a little more clarity as to the exact syntax of the expected entries, or an example patch like the ones given for PCIe x1 (C0) and PCIe x8 (C7) in RP mode, but for PCIe C5 EP (pcie_ep@141a0000)?

Thanks.

Hi,

If this is the devkit case, I don't think you need to do that work on the pinmux/device tree. It should already be handled.

https://docs.nvidia.com/jetson/archives/r35.1/DeveloperGuide/text/HR/JetsonModuleAdaptationAndBringUp/JetsonAgxOrinSeries.html?highlight=endpoint

Hi,

So the steps in Enable PCIe in a Customer CVB Design should be skipped, and the steps in Bring up Tegra PCIe Endpoint Mode will still put C5 into EP mode (pcie_ep@141a0000), even though that section uses Tegra, Xavier, and Orin interchangeably?

In subsection Hardware Requirements for a Tegra PCIe Endpoint Mode:

Step 3: is the Orin crossover cable the same as the Xavier one in Jetson_AGX_Xavier_PCIe_Endpoint_Design_Guidelines.pdf, Figure 3?

In subsection Enabling the PCIe Endpoint on a Jetson AGX Orin Devkit:

Step 1: in p3701.conf.common, leave config #1 and edit ODMDATA="gbe-uphy-config-22,hsstp-lane-map-3,nvhs-uphy-config-1,hsio-uphy-config-0,gbe0-enable-10g";
Step 2: flash the Orin with sudo ./flash.sh jetson-agx-orin-devkit mmcblk0p1

In subsection Connecting and Configuring the Tegra PCIe Endpoint System:

Step 3: once booted, I am not sure whether mount -t configfs none /sys/kernel/config needs to be executed (pci-endpoint-cfs.rst, line 20). Then edit /sys/kernel/config/pci_ep/functions/pci_epf_nv_test/func1/vendorid and deviceid. If /sys/kernel/config/pci_ep/controllers/141a0000.pcie_ep/start shows up, write to it. Boot the RP system and proceed to the Testing PCIe Endpoint Support steps.
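In other words, my understanding of the whole sequence is roughly the sketch below, pieced together from the generic Linux pci-endpoint-cfs flow rather than copied from the NVIDIA document (the vendor/device IDs and the mkdir/ln steps are my assumptions):

sudo mount -t configfs none /sys/kernel/config                     # only if configfs is not already mounted
cd /sys/kernel/config/pci_ep
sudo mkdir -p functions/pci_epf_nv_test/func1                      # func1 may already exist
echo 0x10de | sudo tee functions/pci_epf_nv_test/func1/vendorid    # assumed NVIDIA vendor ID
echo 0x0001 | sudo tee functions/pci_epf_nv_test/func1/deviceid    # assumed device ID
sudo ln -s functions/pci_epf_nv_test/func1 controllers/141a0000.pcie_ep/
echo 1 | sudo tee controllers/141a0000.pcie_ep/start               # start the endpoint function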

Thanks.

Hi,

It appears that Jetson AGX Orin Platform Adaptation and Bring-Up has been updated, but the PCIe EP section has somewhat regressed: the Bring up Tegra PCIe Endpoint Mode subsection with 3 steps, which on Sep 27 resembled Jetson AGX Xavier PCIe Endpoint Mode, has disappeared. The Jetson AGX Orin Platform Adaptation and Bring-Up (Jetson Linux Developer Guide 34.1) link provided on Sep 25 points to the top of that page and says to skip the Enable PCIe in a Custom CVB design section, but since the subsequent section(s) have disappeared, it is not clear what the specific steps are to set up Orin Dev Kit PCIe C5 in EP mode. Have these sections/instructions migrated to some other document, given that the original content has since changed? Also, is the PCIe crossover cable the same as before?

Thanks.

Sorry, what exactly is missing here? It is hard to understand what you want to say with so many links.

Why not just tell us which steps you feel are not right?

PCIe EP mode setup is covered on this page.

https://docs.nvidia.com/jetson/archives/r35.1/DeveloperGuide/text/SD/Communications/PcieEndpointMode.html

Hi,

There was a little confusion due to the web page update; the PCIe EP mode setup content that used to be embedded in Bring up Tegra PCIe Endpoint Mode can now be found in Flashing the PCIe Endpoint on a Jetson AGX Orin Series System.

Following the instructions, I was able to change p3701.conf.common, flash, change tegra_defconfig, rebuild the kernel, and execute all the steps up to and including busybox devmem 0x4307b8000 32 0xfa950000. However, the setpci -s 0005:01:00.0 COMMAND=0x02 step fails with
Warning: No devices selected for "COMMAND=0x02"
Is this predicated on correct PCIe cabling and the RP being booted, or on something else? Is there any other diagnostic command to run on the Orin EP standalone?
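One diagnostic I have been considering (my own idea, not from the documentation): list the NVIDIA functions actually enumerated on the machine where setpci is run, and use whatever BDF shows up there instead of the documented 0005:01:00.0, along these lines:

lspci -d 10de:                                # list NVIDIA devices with their actual bus/device/function numbers
sudo setpci -s <BDF-from-above> COMMAND=0x02  # <BDF-from-above> is a placeholder for the address printed above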

The instructions do not explicitly specify the C5 pinmux; is it already configured for the Dev Kit by default, or is there an extra step?

The crossover cable schematic is in Jetson_AGX_Xavier_PCIe_Endpoint_Design_Guidelines.pdf, which notes that "the power rails of each connector have different net-names, so they are not connected." Does the same schematic still apply to the Orin Dev Kit EP? If yes, does it mean that all pins below A11/B11 are disconnected, or that only the power pins are disconnected and some of A1-B10 are still connected (PCI Express - Wikipedia)?

Thanks.

Hi,

If this is an Orin devkit, then you only need to change the ODMDATA inside p3701.conf.common.
You don't need to do anything else. For example, a pinmux change is not needed, and tegra_defconfig changes are also not needed.

I am actually not sure why you think you need to run so many extra steps to make it work…

Hi,

So the pinmux is already set, thanks. Setting CONFIG_STRICT_DEVMEM=n in tegra_defconfig seems to be needed, otherwise busybox devmem 0x4307b8000 fails even with sudo. I am not sure any extra setup/configuration steps are needed, but perhaps a diagnostic step is, since there is an error at the last step, setpci -s 0005:01:00.0 COMMAND=0x02. As I understand it, this command is what allows the PCIe endpoint to respond to PCIe memory accesses from the root port system.

Regarding the cables (swap board) to connect the two devices, NVIDIA Jetson AGX Xavier PCIe Endpoint Design Guidelines (DA-09357), Figure 3: there is a note about the power rails not being connected. Just to confirm, are no pins below A11/B11 connected, not even ground pins A4/B4?

Thanks.

I don’t get what you mean. The pin connections are listed in figure 3. Ground pins are connected.

Hi,

I meant the ground-to-ground pins A4<->B4, A18<->B18, A49<->B49.

The setpci -s 0005:01:00.0 COMMAND=0x02 step still reports Warning: No devices selected for "COMMAND=0x02"; any insight, or additional logging/debug commands to run around it?

setpci -s 0001:00:00.0 COMMAND=0x02 succeeds, but on the RP, busybox devmem reads from and writes to 0x70000000 have no effect: reads always return 0xffffffff, which doesn't match the Orin value 0xfa950000 at 0x199a3a000, and writes don't change the value.

On Orin EP:

dmesg | grep pci_epf_nv_test
[ 149.154801] pci_epf_nv_test pci_epf_nv_test.0: BAR0 RAM phys: 0x199a3a000

lspci -v
0001:00:00.0 PCI bridge: NVIDIA Corporation Device 229e (rev a1) (prog-if 00 [Normal decode])
Flags: fast devsel, IRQ 52
Bus: primary=00, secondary=01, subordinate=ff, sec-latency=0
I/O behind bridge: 00001000-00001fff [size=4K]
Memory behind bridge: 40000000-400fffff [size=1M]
Prefetchable memory behind bridge: [disabled]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [70] Express Root Port (Slot-), MSI 00
Capabilities: [b0] MSI-X: Enable- Count=1 Masked-
Capabilities: [100] Advanced Error Reporting
Capabilities: [148] Secondary PCI Express
Capabilities: [158] Physical Layer 16.0 GT/s <?>
Capabilities: [17c] Lane Margining at the Receiver <?>
Capabilities: [190] L1 PM Substates
Capabilities: [1a0] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
Capabilities: [2a0] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
Capabilities: [2d8] Data Link Feature <?>
Capabilities: [2e4] Precision Time Measurement
Capabilities: [2f0] Vendor Specific Information: ID=0004 Rev=1 Len=054 <?>
Capabilities: [358] Vendor Specific Information: ID=0006 Rev=0 Len=018 <?>
Kernel driver in use: pcieport

On RP:

lspci -v
0000:01:00.0 RAM memory: NVIDIA Corporation Device 0001
Flags: fast devsel, IRQ 255, NUMA node 0
Memory at 70000000 (32-bit, non-prefetchable) [disabled] [size=64K]
Memory at 3c0000000000 (64-bit, prefetchable) [disabled] [size=128K]
Memory at 70010000 (64-bit, non-prefetchable) [disabled] [size=4K]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
Capabilities: [70] Express Endpoint, MSI 00
Capabilities: [b0] MSI-X: Enable- Count=8 Masked-
Capabilities: [100] Advanced Error Reporting
Capabilities: [148] Secondary PCI Express
Capabilities: [168] Physical Layer 16.0 GT/s <?>
Capabilities: [190] Lane Margining at the Receiver <?>
Capabilities: [1b8] Latency Tolerance Reporting
Capabilities: [1c0] L1 PM Substates
Capabilities: [1d0] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
Capabilities: [2d0] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
Capabilities: [308] Data Link Feature <?>
Capabilities: [314] Precision Time Measurement
Capabilities: [320] Vendor Specific Information: ID=0003 Rev=1 Len=054 <?>
Capabilities: [388] Vendor Specific Information: ID=0006 Rev=0 Len=018 <?>
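One thing I notice in the RP lspci output above is that all three memory regions show [disabled], which I read as memory decoding not yet being enabled on the RP-side function. My unverified guess is that the setpci COMMAND=0x02 step needs to be run against this RP BDF and the regions rechecked, roughly:

sudo setpci -s 0000:01:00.0 COMMAND=0x02   # enable memory space on the RP-side function (BDF taken from the lspci output above)
lspci -v -s 0000:01:00.0                   # the Memory at ... lines should no longer show [disabled]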

Thanks.

Sorry, I am a little confused by the current status. Could you share only the RP "lspci" result with me?

Hi,

The current status is that I am able to get past the setpci step by using a different device name (BDF) on the Orin than the one in the documentation, and the RP can see a RAM memory device, but it is not able to read from or write to it remotely; the value is always 0xffffffff. The RP lspci output starts with "On RP:" above; there are no other NVIDIA entries.

Thanks.

Hi,

No, I feel there is something wrong in this whole thread. Please just share the info I need first.
There is no need to explain it yourself; the logs will tell the situation.

  1. lspci result on RP

  2. dmesg on RP and EP

Hi,

Here are the logs.

  1. lspci -v RP.lspci.log (17.5 KB) and EP.lspci.log (1.9 KB)
  2. dmesg RP.dmesg.log (87.1 KB) and EP.dmesg.log (72.7 KB)

Thanks.

Hi,

The document we shared is based on two Orins; the RP side is also an Orin.

Could you test with that setup too?

Hi,

The document, under Hardware Requirements, says "you can use any standard x86-64 PC that is running Linux", and based on PCIe standardization and some other Xavier PCIe posts, that is to be expected. Unfortunately I don't have another Orin to test with, but the RP PC works with other PCIe cards, including an NVIDIA GPU, and can see the Orin as an EP device; it just can't read/write from/to it. Has NVIDIA confirmed that the PCIe EP workflow works between a host PC and the Orin dev kit?
Based on the EP and RP logs, are there any issues that stand out, or any other logs we should collect?

Thanks.

Hi,

Below is our operation on an x86 host, and it is working.

lspci on RP

b3:00.0 RAM memory: NVIDIA Corporation Device 0001
        Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR+ <PERR- INTx-
        Interrupt: pin A routed to IRQ 11
        NUMA node: 0
        Region 0: Memory at fbe00000 (32-bit, non-prefetchable) [size=64K]
        Region 2: Memory at fbd00000 (64-bit, prefetchable) [size=128K]
        Region 4: Memory at fbe10000 (64-bit, non-prefetchable) [size=4K]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
                Address: 0000000000000000  Data: 0000
                Masking: 00000000  Pending: 00000000
        Capabilities: [70] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
                DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 256 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr+ NonFatalErr+ FatalErr- UnsupReq+ AuxPwr+ TransPend-
                LnkCap: Port #0, Speed 16GT/s, Width x8, ASPM not supported
                        ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 8GT/s (downgraded), Width x8 (ok)
                        TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR+
                         10BitTagComp+ 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
                         EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
                         FRS- TPHComp- ExtTPHComp-
                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled,
                         AtomicOpsCtl: ReqEn-
                LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
                LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+ EqualizationPhase1+
                         EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
                         Retimer- 2Retimers- CrosslinkRes: Upstream Port
        Capabilities: [b0] MSI-X: Enable- Count=8 Masked-
                Vector table: BAR=2 offset=00000000
                PBA: BAR=2 offset=00010000
        Capabilities: [100 v2] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
                AERCap: First Error Pointer: 14, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
                        MultHdrRecCap+ MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 40000002 000002ff fbe00000 5a5aaa55
        Capabilities: [148 v1] Secondary PCI Express
                LnkCtl3: LnkEquIntrruptEn- PerformEqu-
                LaneErrStat: 0
        Capabilities: [168 v1] Physical Layer 16.0 GT/s <?>
        Capabilities: [190 v1] Lane Margining at the Receiver <?>
        Capabilities: [1b8 v1] Latency Tolerance Reporting
                Max snoop latency: 0ns
                Max no snoop latency: 0ns
        Capabilities: [1c0 v1] L1 PM Substates
                L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2- ASPM_L1.1- L1_PM_Substates+
                          PortCommonModeRestoreTime=60us PortTPowerOnTime=40us
                L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
                           T_CommonMode=0us
                L1SubCtl2: T_PwrOn=10us
        Capabilities: [1d0 v1] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
        Capabilities: [2d0 v1] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
        Capabilities: [308 v1] Data Link Feature <?>
        Capabilities: [314 v1] Precision Time Measurement
                PTMCap: Requester:+ Responder:- Root:-
                PTMClockGranularity: Unimplemented
                PTMControl: Enabled:- RootSelected:-
                PTMEffectiveGranularity: Unknown
        Capabilities: [320 v1] Vendor Specific Information: ID=0003 Rev=1 Len=054 <?>
        Capabilities: [388 v1] Vendor Specific Information: ID=0006 Rev=0 Len=018 <?>

Operations:

root@22:/home/nvidia# setpci -s b3:00.0 COMMAND=0x02
root@22:/home/nvidia# devmem2 0xfbe00000 w 0x55aa5a5A
/dev/mem opened.
Memory mapped at address 0x7f7a7ccf0000.
Value at address 0xFBE00000 (0x7f7a7ccf0000): 0x55AA5A5A
Written 0x55AA5A5A; readback 0x55AA5A5A
root@22:/home/nvidia# devmem2 0xfbe00000 w 0x67891234
/dev/mem opened.
Memory mapped at address 0x7fc44cc48000.
Value at address 0xFBE00000 (0x7fc44cc48000): 0x55AA5A5A
Written 0x67891234; readback 0x67891234
root@22:/home/nvidia# devmem2 0xfbe00000 w 0x12345678
/dev/mem opened.
Memory mapped at address 0x7f0abf68d000.
Value at address 0xFBE00000 (0x7f0abf68d000): 0x67891234
Written 0x12345678; readback 0x12345678
root@22:/home/nvidia#
root@22:/home/nvidia# devmem2 0xfbe00000
/dev/mem opened.
Memory mapped at address 0x7fc8a586e000.
Value at address 0xFBE00000 (0x7fc8a586e000): 0x12345678

Please make sure you disable the CONFIG_STRICT_DEVMEM config in the Orin kernel.

You can use the command below to confirm.

root@Orin:/sys/kernel/config/pci_ep# zcat /proc/config.gz |grep CONFIG_STRICT_DEVMEMM
root@Orin:/sys/kernel/config/pci_ep#

Hi,

Thank you for the confirmation. The Orin side was OK; I had missed the setpci device name on the RP side. devmem2 reads and writes now work remotely from both the RP and EP sides.

I was trying to run a more realistic bandwidth/latency check with a large data transfer, and can see that there is a way using /sys/kernel/debug/pcie-x/ on the RP (where x is one of 0,1,2,3) and running cat write. I don't see that subfolder on the RP or on other PCs; is it only enabled with the CONFIG_PCIE_TEGRA_DW_DMA_TEST=y flag on Jetson? Or can it be enabled on a regular Linux PC too? The closest flags by name in the kernel .config are DMATEST=y and DMA_API_DEBUG=y.
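For reference, my understanding of how that interface would be exercised on a Jetson RP built with CONFIG_PCIE_TEGRA_DW_DMA_TEST=y is roughly the following; only the pcie-x directory name and the write node come from what I have read, and the controller index is a guess:

sudo ls /sys/kernel/debug | grep pcie       # check whether any pcie-x test directories exist at all
sudo cat /sys/kernel/debug/pcie-0/write     # index 0 is a guess; reading the node is what triggers the DMA write test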

I was also looking through the JetPack kernel source samples, but I don't see a direct standalone example with DMA and conversion from virtual to physical/bus address space. There are
kernel/nvidia/drivers/misc/tegra-pcie-ep-mem.c: static int write(struct seq_file *s, void *data) and read(struct seq_file *s, void *data)
kernel/nvidia/drivers/pci/host/pcie-tegra-dw.c: static int write(struct seq_file *s, void *data) and read(struct seq_file *s, void *data)
but I don't know how to populate s and data, or whether these can be used directly from a main() reader on the RP side and a main() writer on the EP side. Any suggestions for good base source code to run EP writes and RP reads?

Thanks.

Hi,

I changed these .config flags:
DMATEST=y
DMA_API_DEBUG=y
CONFIG_PCI_ENDPOINT_TEST=y
CONFIG_PCI_EPF_TEST=y

rebuilt the kernel, and after boot and after setpci, /sys/kernel/debug contains only
drwxr-xr-x 2 root root 0 Dec 31 1969 dma-api
drwxr-xr-x 2 root root 0 Dec 31 1969 dma_buf
with no pcie-x, so either some other .config flags are needed or pcie-x is specific to the Jetson kernel.
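To double-check which of those options actually made it into the running kernel (the same /proc/config.gz trick suggested earlier in the thread, assuming the PC kernel exposes it):

zcat /proc/config.gz | grep -E 'CONFIG_DMATEST|CONFIG_DMA_API_DEBUG|CONFIG_PCI_ENDPOINT_TEST|CONFIG_PCI_EPF_TEST|CONFIG_PCIE_TEGRA_DW_DMA_TEST'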

Since debug/pcie-x may not be an option on the PC RP, can any of
tegra-pcie-ep-mem.c or pcie-tegra-dw.c write() or write_ll() and read() or read_ll()
potentially be called from an executable main() on the RP and EP? If that is an option, what would be the correct way to populate s and data?

Thanks.

Hi,

I am still trying to check the DMA speed of PCIe EP writes and PC RP reads. So far it looks like an Orin RP can be set up for testing using /sys/kernel/debug/pcie-x/ and cat write, based on enabling CONFIG_PCIE_TEGRA_DW_DMA_TEST=y and some patches covered in "The bandwidth of virtual ethernet over PCIe between two xaviers is low". But that flag does not exist for the PC RP kernel; there are some PCIe DMA test flags, but they do not enable /sys/kernel/debug/pcie-x on the RP, and /pci/dma does not show any NVIDIA devices. Based on "AGX Endpoint PCIe DMA", the speed test should be doable, but it is not clear what needs to be changed in which file and how to build it on a PC.

Is there some guidance on how to modify kernel/nvidia/drivers/pci/dwc/pcie-tegra.c, which has #ifdef CONFIG_PCIE_TEGRA_DW_DMA_TEST, and build it so as to eventually enable DMA writes from user-space virtual addresses to a physical/bus address on the EP (Orin), and DMA reads from the physical address (0000:01:00.0 NVIDIA RAM memory on the RP) into user space on the RP (PC), all driven from user space (main())? Or does this require writing a custom driver, or perhaps using mmap or similar from user space?

Thanks.