Enabling Orin Dev Kit PCIe EP mode

Hi,

It appears that the Jetson AGX Orin Platform Adaptation and Bring-Up guide has been updated, but the PCIe EP section has somewhat regressed: the three-step "Bring up Tegra PCIe Endpoint Mode" procedure from Sep 27, which resembled Jetson AGX Xavier PCIe Endpoint Mode, seems to have disappeared. The Jetson AGX Orin Platform Adaptation and Bring-Up page of the Jetson Linux Developer Guide 34.1 documentation provided on Sep 25 points to the top of that page and to skipping the "Enable PCIe in a Custom CVB Design" section, but since the subsequent section(s) have disappeared, it is not clear what the specific steps are to put the Orin Kit PCIe C5 controller into EP mode. Have these sections/instructions migrated to some other document, given that the original content has since changed? Also, is the PCIe crossover cable the same as before?

Thanks.

Sorry, what exactly is missing here? It is hard to understand what you want to say with so many links.

Why not just tell us which steps do not seem right to you?

PCIe EP mode setup is covered on this page.

https://docs.nvidia.com/jetson/archives/r35.1/DeveloperGuide/text/SD/Communications/PcieEndpointMode.html

Hi,

There was a little confusion due to the web page update: the PCIe EP mode setup content that used to be embedded in "Bring up Tegra PCIe Endpoint Mode" can now be found in "Flashing the PCIe Endpoint on a Jetson AGX Orin Series System".

Following the instructions, I was able to change p3701.conf.common, flash, change tegra_defconfig, rebuild the kernel, and execute all the steps up to and including busybox devmem 0x4307b8000 32 0xfa950000. However, the setpci -s 0005:01:00.0 COMMAND=0x02 step fails with
Warning: No devices selected for “COMMAND=0x02”
Is this predicated on correct PCIe cabling and the RP being booted, or something else? Is there any other diagnostic command I can run on the Orin EP standalone?
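In case it helps, these are the kinds of generic checks that can be run on the Orin EP side standalone, using the standard PCI endpoint configfs layout (the exact controller/function names depend on the setup, so treat the paths as illustrative):

ls /sys/kernel/config/pci_ep/controllers/   # endpoint controllers registered by the kernel
ls /sys/kernel/config/pci_ep/functions/     # endpoint function drivers, e.g. pci_epf_nv_test
dmesg | grep -i -e pci_epf -e endpoint      # messages from the EP function driver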

The instructions do not explicitly specify the C5 pinmux; is it already configured for the Dev Kit by default, or is there an extra step?

The crossover cable schematic is in Jetson_AGX_Xavier_PCIe_Endpoint_Design_Guidelines.pdf


and it notes: “Note that the power rails of each connector have different net-names, so they are not connected.” Does the same schematic still apply for the Orin Dev Kit EP, and if so, does it mean that all pins below A11/B11 are disconnected, or that only the power pins are disconnected and some of A1-B10 are still connected (per the pinout in PCI Express - Wikipedia)?

Thanks.

Hi,

If this is the Orin devkit, then you only need to change the ODMDATA inside p3701.conf.common…
You don’t need to do anything else. For example, a pinmux change is not needed, and tegra_defconfig is also not needed.

I am actually not sure why you think you need to run so many extra steps to make it work…

Hi,

So the pinmux is already set, thanks. The tegra_defconfig change CONFIG_STRICT_DEVMEM=n does seem to be needed, otherwise busybox devmem 0x4307b8000 fails even with sudo. I am not sure whether any extra setup/configuration steps are needed, but is there perhaps a diagnostic step, since there is an error at the last step, setpci -s 0005:01:00.0 COMMAND=0x02? That command is supposed to allow the PCIe endpoint to respond to PCIe memory accesses on the root port system.
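That warning from setpci appears to just mean that the -s selector matched no device, so a quick sanity check on whichever system the setpci is issued on would be something like the following (generic pciutils commands; 10de is the NVIDIA vendor ID):

lspci -s 0005:01:00.0    # does any function with that BDF exist at all?
lspci -nn -d 10de:       # list every NVIDIA function visible on this system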

Regarding the cables (swap board) used to connect the two devices, NVIDIA Jetson AGX Xavier PCIe Endpoint Design Guidelines (DA-09357) Figure 3 has a note about the power rails not being connected. Just to confirm: are no pins below A11/B11 connected, not even the ground pins A4 and B4?

Thanks.

I don’t get what you mean. The pin connections are listed in figure 3. Ground pins are connected.

Hi,

I meant the ground-to-ground pins A4<->B4, A18<->B18, A49<->B49.

The setpci -s 0005:01:00.0 COMMAND=0x02 step still fails with Warning: No devices selected for “COMMAND=0x02”; any insight, or an additional logging/debug command to run around it?

setpci -s 0001:00:00.0 COMMAND=0x02 succeeds, but on the RP a busybox devmem read from and write to 0x70000000 has no effect: the read is always 0xffffffff, which does not match the Orin value 0xfa950000 at 0x199a3a000, and the write does not overwrite the value.

On Orin EP:

dmesg | grep pci_epf_nv_test
[ 149.154801] pci_epf_nv_test pci_epf_nv_test.0: BAR0 RAM phys: 0x199a3a000

lspci -v
0001:00:00.0 PCI bridge: NVIDIA Corporation Device 229e (rev a1) (prog-if 00 [Normal decode])
Flags: fast devsel, IRQ 52
Bus: primary=00, secondary=01, subordinate=ff, sec-latency=0
I/O behind bridge: 00001000-00001fff [size=4K]
Memory behind bridge: 40000000-400fffff [size=1M]
Prefetchable memory behind bridge: [disabled]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [70] Express Root Port (Slot-), MSI 00
Capabilities: [b0] MSI-X: Enable- Count=1 Masked-
Capabilities: [100] Advanced Error Reporting
Capabilities: [148] Secondary PCI Express
Capabilities: [158] Physical Layer 16.0 GT/s <?>
Capabilities: [17c] Lane Margining at the Receiver <?>
Capabilities: [190] L1 PM Substates
Capabilities: [1a0] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
Capabilities: [2a0] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
Capabilities: [2d8] Data Link Feature <?>
Capabilities: [2e4] Precision Time Measurement
Capabilities: [2f0] Vendor Specific Information: ID=0004 Rev=1 Len=054 <?>
Capabilities: [358] Vendor Specific Information: ID=0006 Rev=0 Len=018 <?>
Kernel driver in use: pcieport

On RP:

lspci -v
0000:01:00.0 RAM memory: NVIDIA Corporation Device 0001
Flags: fast devsel, IRQ 255, NUMA node 0
Memory at 70000000 (32-bit, non-prefetchable) [disabled] [size=64K]
Memory at 3c0000000000 (64-bit, prefetchable) [disabled] [size=128K]
Memory at 70010000 (64-bit, non-prefetchable) [disabled] [size=4K]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
Capabilities: [70] Express Endpoint, MSI 00
Capabilities: [b0] MSI-X: Enable- Count=8 Masked-
Capabilities: [100] Advanced Error Reporting
Capabilities: [148] Secondary PCI Express
Capabilities: [168] Physical Layer 16.0 GT/s <?>
Capabilities: [190] Lane Margining at the Receiver <?>
Capabilities: [1b8] Latency Tolerance Reporting
Capabilities: [1c0] L1 PM Substates
Capabilities: [1d0] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
Capabilities: [2d0] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
Capabilities: [308] Data Link Feature <?>
Capabilities: [314] Precision Time Measurement
Capabilities: [320] Vendor Specific Information: ID=0003 Rev=1 Len=054 <?>
Capabilities: [388] Vendor Specific Information: ID=0006 Rev=0 Len=018 <?>

Thanks.

Sorry, I am a little confused by the current status. Could you share only the RP “lspci” result with me?

Hi,

The current status is that I am able to get past the setpci step using a different Orin device name than the one in the documentation, and the RP can see a RAM memory device but is not able to read from or write to it remotely; the value is always 0xffffffff. The RP lspci output starts with “On RP:” above; there are no other NVIDIA entries.

Thanks.

Hi,

No, I feel there is something wrong in the whole thread. Please just share the info I need first.
There is no need to explain it yourself. The logs will tell the situation.

  1. lspci result on RP

  2. dmesg on RP and EP

Hi,

Here are the logs.

  1. lspci -v RP.lspci.log (17.5 KB) and EP.lspci.log (1.9 KB)
  2. dmesg RP.dmesg.log (87.1 KB) and EP.dmesg.log (72.7 KB)

Thanks.

Hi,

The document we shared is based on two Orins; the RP side is also an Orin.

Could you test with that setup too?

Hi,

The document says under Hardware Requirements that “you can use any standard x86-64 PC that is running Linux”, and based on PCIe standardization and some other Xavier PCIe posts, that is to be expected. Unfortunately I do not have another Orin to test with, but the RP PC works with other PCIe cards, including an NVIDIA GPU, and can see the Orin as an EP device; it just cannot read from or write to it. Has NVIDIA confirmed that the PCIe EP workflow works between a host PC and the Orin dev kit?
Based on the EP and RP logs, are there any issues that stand out, or any other logs we should collect?

Thanks.

Hi,

Below is our procedure on an x86 host, and it is working.

lspci on RP

b3:00.0 RAM memory: NVIDIA Corporation Device 0001
        Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR+ <PERR- INTx-
        Interrupt: pin A routed to IRQ 11
        NUMA node: 0
        Region 0: Memory at fbe00000 (32-bit, non-prefetchable) [size=64K]
        Region 2: Memory at fbd00000 (64-bit, prefetchable) [size=128K]
        Region 4: Memory at fbe10000 (64-bit, non-prefetchable) [size=4K]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
                Address: 0000000000000000  Data: 0000
                Masking: 00000000  Pending: 00000000
        Capabilities: [70] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
                DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 256 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr+ NonFatalErr+ FatalErr- UnsupReq+ AuxPwr+ TransPend-
                LnkCap: Port #0, Speed 16GT/s, Width x8, ASPM not supported
                        ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 8GT/s (downgraded), Width x8 (ok)
                        TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR+
                         10BitTagComp+ 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
                         EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
                         FRS- TPHComp- ExtTPHComp-
                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled,
                         AtomicOpsCtl: ReqEn-
                LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
                LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+ EqualizationPhase1+
                         EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
                         Retimer- 2Retimers- CrosslinkRes: Upstream Port
        Capabilities: [b0] MSI-X: Enable- Count=8 Masked-
                Vector table: BAR=2 offset=00000000
                PBA: BAR=2 offset=00010000
        Capabilities: [100 v2] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
                AERCap: First Error Pointer: 14, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
                        MultHdrRecCap+ MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 40000002 000002ff fbe00000 5a5aaa55
        Capabilities: [148 v1] Secondary PCI Express
                LnkCtl3: LnkEquIntrruptEn- PerformEqu-
                LaneErrStat: 0
        Capabilities: [168 v1] Physical Layer 16.0 GT/s <?>
        Capabilities: [190 v1] Lane Margining at the Receiver <?>
        Capabilities: [1b8 v1] Latency Tolerance Reporting
                Max snoop latency: 0ns
                Max no snoop latency: 0ns
        Capabilities: [1c0 v1] L1 PM Substates
                L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2- ASPM_L1.1- L1_PM_Substates+
                          PortCommonModeRestoreTime=60us PortTPowerOnTime=40us
                L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
                           T_CommonMode=0us
                L1SubCtl2: T_PwrOn=10us
        Capabilities: [1d0 v1] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
        Capabilities: [2d0 v1] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
        Capabilities: [308 v1] Data Link Feature <?>
        Capabilities: [314 v1] Precision Time Measurement
                PTMCap: Requester:+ Responder:- Root:-
                PTMClockGranularity: Unimplemented
                PTMControl: Enabled:- RootSelected:-
                PTMEffectiveGranularity: Unknown
        Capabilities: [320 v1] Vendor Specific Information: ID=0003 Rev=1 Len=054 <?>
        Capabilities: [388 v1] Vendor Specific Information: ID=0006 Rev=0 Len=018 <?>

Operations:

root@22:/home/nvidia# setpci -s b3:00.0 COMMAND=0x02
root@22:/home/nvidia# devmem2 0xfbe00000 w 0x55aa5a5A
/dev/mem opened.
Memory mapped at address 0x7f7a7ccf0000.
Value at address 0xFBE00000 (0x7f7a7ccf0000): 0x55AA5A5A
Written 0x55AA5A5A; readback 0x55AA5A5A
root@22:/home/nvidia# devmem2 0xfbe00000 w 0x67891234
/dev/mem opened.
Memory mapped at address 0x7fc44cc48000.
Value at address 0xFBE00000 (0x7fc44cc48000): 0x55AA5A5A
Written 0x67891234; readback 0x67891234
root@22:/home/nvidia# devmem2 0xfbe00000 w 0x12345678
/dev/mem opened.
Memory mapped at address 0x7f0abf68d000.
Value at address 0xFBE00000 (0x7f0abf68d000): 0x67891234
Written 0x12345678; readback 0x12345678
root@22:/home/nvidia# devmem2 0xfbe00000
/dev/mem opened.
Memory mapped at address 0x7fc8a586e000.
Value at address 0xFBE00000 (0x7fc8a586e000): 0x12345678

Please make sure you disable the CONFIG_STRICT_DEVMEM config in the Orin kernel.

You can use the command below to confirm:

root@Orin:/sys/kernel/config/pci_ep# zcat /proc/config.gz |grep CONFIG_STRICT_DEVMEMM
root@Orin:/sys/kernel/config/pci_ep#

Hi,

Thank you for the confirmation. The Orin side was OK; I had missed that the setpci device name has to be the one seen on the RP side. devmem2 read and write now work remotely from both the RP and the EP side.

I was trying to run a more realistic bandwidth/latency check with a large data transfer, and it looks like there is a way on an RP by using /sys/kernel/debug/pcie-x/ (where x is one of 0,1,2,3) and cat write. I do not see that subfolder on the RP or on other PCs; is it only enabled by the CONFIG_PCIE_TEGRA_DW_DMA_TEST=y flag on Jetson, or can it be enabled on a regular Linux PC as well? The closest flags by name I see in the kernel .config are DMATEST=y and DMA_API_DEBUG=y.
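To at least confirm which of these options a given kernel was built with, the same /proc/config.gz check used above for CONFIG_STRICT_DEVMEM should work; the /boot/config path below is the usual Ubuntu location on a PC and is an assumption, not something from the documentation:

zcat /proc/config.gz | grep -E 'PCIE_TEGRA_DW_DMA_TEST|DMATEST|DMA_API_DEBUG'   # Jetson
grep -E 'DMATEST|DMA_API_DEBUG' /boot/config-$(uname -r)                        # typical PC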

I was also looking through the JetPack kernel source samples, but I do not see a direct standalone example with DMA and conversion from virtual to physical/bus address space. There is
kernel/nvidia/drivers/misc/tegra-pcie-ep-mem.c with static int write(struct seq_file *s, void *data) and read(struct seq_file *s, void *data), and
kernel/nvidia/drivers/pci/host/pcie-tegra-dw.c with static int write(struct seq_file *s, void *data) and read(struct seq_file *s, void *data),
but I do not know how to populate s and data. Can these be used directly from a main() reader on the RP side and a main() writer on the EP side? Any suggestions on a good base source for running EP writes and RP reads?
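From reading the generic seq_file/debugfs API (this is a sketch of the usual pattern, not the actual NVIDIA code), these write()/read() functions look like seq_file show callbacks: they cannot be called from a user-space main() at all, but run when the corresponding debugfs node is read (e.g. cat /sys/kernel/debug/<dir>/write), and the driver context normally arrives through s->private rather than the data argument. Roughly:

/* Hypothetical sketch of how such a debugfs "write" node is usually wired up. */
#include <linux/debugfs.h>
#include <linux/module.h>
#include <linux/seq_file.h>

static struct dentry *dbg_dir;
static int my_ctx = 42;                 /* stand-in for the driver's private context */

static int my_write_show(struct seq_file *s, void *data)
{
	/* s->private is the pointer passed as the "data" argument of
	 * debugfs_create_file(); the "data" parameter here is just the
	 * seq_file iterator and is unused for single_open-style files.
	 * A real driver would kick off its DMA test here and print stats. */
	int *ctx = s->private;

	seq_printf(s, "triggered test, ctx=%d\n", *ctx);
	return 0;
}
DEFINE_SHOW_ATTRIBUTE(my_write);        /* generates my_write_fops */

static int __init my_init(void)
{
	dbg_dir = debugfs_create_dir("my_pcie_test", NULL);
	debugfs_create_file("write", 0444, dbg_dir, &my_ctx, &my_write_fops);
	return 0;
}

static void __exit my_exit(void)
{
	debugfs_remove_recursive(dbg_dir);
}

module_init(my_init);
module_exit(my_exit);
MODULE_LICENSE("GPL");

So the trigger would be cat /sys/kernel/debug/my_pcie_test/write from user space, not a direct call from main().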

Thanks.

Hi,

I changed the following .config flags:
DMATEST=y
DMA_API_DEBUG=y
CONFIG_PCI_ENDPOINT_TEST=y
CONFIG_PCI_EPF_TEST=y

rebuilt the kernel, and after boot and after setpci, /sys/kernel/debug contains only
drwxr-xr-x 2 root root 0 Dec 31 1969 dma-api
drwxr-xr-x 2 root root 0 Dec 31 1969 dma_buf
but no pcie-x, so either some other .config flags are needed, or pcie-x is specific to the Jetson kernel.

Since debug/pcie-x may not be an option on the PC RP, can any of the tegra-pcie-ep-mem.c or pcie-tegra-dw.c write()/write_ll() and read()/read_ll() functions potentially be called from an executable main() on the RP and EP? If that is an option, what would be the correct code to populate s and data?

Thanks.

Hi,

I am still trying to check the PCIe EP-write / PC-RP-read DMA speed. So far it looks like an Orin RP can be set up for testing using /sys/kernel/debug/pcie-x/ and cat write, based on enabling CONFIG_PCIE_TEGRA_DW_DMA_TEST=y and some patches covered in "The bandwidth of virtual ethernet over PCIe between two xaviers is low". But that flag does not exist for the PC RP kernel; there are some PCIe DMA test flags, but they do not enable /sys/kernel/debug/pcie-x on the RP, and /pci/dma does not show any NV devices. Based on "AGX Endpoint PCIe DMA speed", the test should be doable, but it is not clear what needs to be changed in which file, and how to build it on a PC.

Is there some guidance on how to modify kernel/nvidia/drivers/pci/dwc/pcie-tegra.c (which has #ifdef CONFIG_PCIE_TEGRA_DW_DMA_TEST) and build it, in order to eventually enable, from user space (main()), a DMA write from virtual user space to a physical/bus address on the EP (Orin) and a DMA read from the physical address (the 0000:01:00.0 NV RAM memory on the RP) to user space on the RP (PC)? Or does it require writing a custom driver? Or maybe using mmap or such from user space.
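To illustrate what "using mmap or such from user space" would look like on the RP side, here is a rough sketch of my own (not from the documentation): it maps the endpoint's BAR0 through /dev/mem and copies a pattern into it. The BAR0 address has to be taken from lspci on that particular system (0x70000000 in my earlier RP lspci output, 0xfbe00000 in the log above), and note this is plain programmed I/O by the CPU, not the controller's DMA engine:

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define BAR0_PHYS 0x70000000UL  /* adjust to the BAR0 address reported by "lspci -v" */
#define BAR0_SIZE 0x10000UL     /* 64 KiB BAR0 window */

int main(void)
{
	int fd = open("/dev/mem", O_RDWR | O_SYNC);
	if (fd < 0) { perror("open /dev/mem"); return 1; }

	void *bar = mmap(NULL, BAR0_SIZE, PROT_READ | PROT_WRITE,
			 MAP_SHARED, fd, BAR0_PHYS);
	if (bar == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

	/* CPU copy into the BAR: programmed I/O over the link, so this does
	 * not measure what the DMA engine could do. */
	uint8_t pattern[4096];
	memset(pattern, 0xA5, sizeof(pattern));
	memcpy(bar, pattern, sizeof(pattern));

	printf("first word readback: 0x%08x\n", (unsigned)*(volatile uint32_t *)bar);

	munmap(bar, BAR0_SIZE);
	close(fd);
	return 0;
}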

Thanks

Hi,

Using mmap and memcpy from main() (write from the EP, read from the PC RP), the results are unexpected: the received data rate and the reported send rate are very different.

EP->RP, 16KB 1000 times, 16MB overall
EP reports 0.0005s, 16/0.0005=32GB/s=240Gb/s
RP reports 2.2s, 16/2.2=7.4MB/s=60Mb/s

When the data is observed on the RP, it arrives in blocks with jitter of up to 0.5 s, so clearly most of the data is dropped or overwritten. On the EP, 32 GB/s is higher than the theoretical rate for the x1 link reported by lspci LnkSta. Also, there is a limit of 16384 bytes on the mmap size; if 32768 is used, the Orin segfaults and restarts. Not sure how to increase that limit?

RP->EP
EP reports 0.0005s, 16/0.0005=32GB/s≈256Gb/s
RP reports 0.7s, 16/0.7=22MB/s=182Mb/s

When the data is observed on the RP, 182 Mb/s is lower than what tvnet achieves, and again the EP's 32 GB/s is more than theoretical. Since similarly divergent results are reported in "Performance issues of data transmission speed in PCIe EP mode": is mmap a viable option for data transfers over PCIe EP from user space to the physical address space mapped for the PCIe EP (RAM)? Are these results due to temporary instability of the PCIe driver, or will it not be performant even when fixed? If it is the latter, is writing a custom kernel driver (Custom Endpoint Function Driver) the only option left, or are there other shortcuts for using DMA from user space?

Thanks.

Hi,

Based on "GPCDMA memory to memory low performance", it looks like user-space DMA would not allow speeds greater than 111 MB/s.

Back to the PCIe/DMA test kernel driver above. EP-side usage is covered in "The bandwidth of virtual ethernet over PCIe between two xaviers is low - #19 by WayneWWW", but the PC RP side (also Ubuntu 20.04) needs to be built, and while there are some posts implying it is buildable, the instructions are a little unclear if not conflicting.

Building kernel/nvidia/drivers/misc/tegra-pcie-ep-mem.c is implied in "Xavier AGX PCIe End-Point : access to dma_alloc_coherent return in CUDA kernel" and "How another CPU communicate with Xavier through PCIE? (Solved) - #8 by guo.tang".
Building kernel/nvidia/drivers/pci/endpoint/functions/pci-epf-nv-test.c is implied in "AGX Endpoint PCIe DMA speed - #5 by jack_lan", but it is not clear what needs to be modified.

I am trying to build tegra-pcie-dma-test.c via kernel_src/nvbuild.sh by first changing tegra_defconfig to include CONFIG_TEGRA_PCIE_DMA_TEST=y. Only tegra-pcie-dma-test.o gets created, not the .ko, and tegra-pcie-ep-mem.o does not get created at all. In kernel/nvidia/drivers/misc/Makefile there is an ifdef CONFIG_ARCH_TEGRA_19x_SOC around obj-$(CONFIG_TEGRA_PCIE_EP_MEM).

If the ifdef is commented out, the build breaks as follows:

make[1]: Leaving directory '/home/me/kernel_src/kernel_out'
  CALL    /home/me/kernel_src/kernel/kernel-5.10/scripts/atomic/check-atomics.sh
  CALL    /home/me/kernel_src/kernel/kernel-5.10/scripts/checksyscalls.sh
  CHK     include/generated/compile.h
  CC      drivers/misc/tegra-pcie-ep-mem.o
/home/me/kernel_src/kernel/nvidia/drivers/misc/tegra-pcie-ep-mem.c: In function ‘init_debugfs’:
/home/me/kernel_src/kernel/nvidia/drivers/misc/tegra-pcie-ep-mem.c:731:4: error: void value not ignored as it ought to be
  731 |  d = debugfs_create_x64("src", 0644, ep->debugfs,
/home/me/kernel_src/kernel/nvidia/drivers/misc/tegra-pcie-ep-mem.c:736:4: error: void value not ignored as it ought to be
  736 |  d = debugfs_create_x64("dst", 0644, ep->debugfs,
/home/me/kernel_src/kernel/nvidia/drivers/misc/tegra-pcie-ep-mem.c:741:4: error: void value not ignored as it ought to be
  741 |  d = debugfs_create_x32("size", 0644, ep->debugfs,
/home/me/kernel_src/kernel/nvidia/drivers/misc/tegra-pcie-ep-mem.c:746:4: error: void value not ignored as it ought to be
  746 |  d = debugfs_create_x8("channel", 0644, ep->debugfs,
make[3]: *** [/home/me/kernel_src/kernel/kernel-5.10/scripts/Makefile.build:281: drivers/misc/tegra-pcie-ep-mem.o] Error 1
make[2]: *** [/home/me/kernel_src/kernel/kernel-5.10/scripts/Makefile.build:498: drivers/misc] Error 2
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [/home/me/kernel_src/kernel/kernel-5.10/Makefile:1854: drivers] Error 2
make: *** [Makefile:213: __sub-make] Error 2
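These errors look like the kernel 5.10 API change where debugfs_create_x64()/x32()/x8() return void instead of a dentry pointer, so the old "d = …" assignments no longer compile. A hedged sketch of the kind of change that should let init_debugfs() in tegra-pcie-ep-mem.c build, assuming the assignments and any checks on d are simply dropped (the &ep->… member names are guesses based on the truncated snippet above, not verified against the source):

	debugfs_create_x64("src", 0644, ep->debugfs, &ep->src);
	debugfs_create_x64("dst", 0644, ep->debugfs, &ep->dst);
	debugfs_create_x32("size", 0644, ep->debugfs, &ep->size);
	debugfs_create_x8("channel", 0644, ep->debugfs, &ep->channel);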

If kernel/nvidia/drivers/misc/Makefile is changed to

obj-m +=tegra-pcie-dma-test.o
all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

and invoked standalone, it produces the following errors:

CPATH=/home/me/kernel_src/kernel/nvidia/include make
make -C /lib/modules/5.4.0-126-generic/build M=/home/me/me/kernel_src/kernel/nvidia/drivers/misc modules
make[1]: warning: jobserver unavailable: using -j1.  Add '+' to parent make rule.
make[1]: Entering directory '/usr/src/linux-headers-5.4.0-126-generic'
  CC [M]  /home/me/kernel_src/kernel/nvidia/drivers/misc/tegra-pcie-dma-test.o
  Building modules, stage 2.
  MODPOST 1 modules
ERROR: "tegra_pcie_edma_initialize" [/home/me/kernel_src/kernel/nvidia/drivers/misc/tegra-pcie-dma-test.ko] undefined!
ERROR: "tegra_pcie_edma_submit_xfer" [/home/me/kernel_src/kernel/nvidia/drivers/misc/tegra-pcie-dma-test.ko] undefined!
ERROR: "tegra_pcie_edma_deinit" [/home/me/kernel_src/kernel/nvidia/drivers/misc/tegra-pcie-dma-test.ko] undefined!
make[2]: *** [scripts/Makefile.modpost:94: __modpost] Error 1
make[1]: *** [Makefile:1675: modules] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-5.4.0-126-generic'
make: *** [Makefile:3: all] Error 2

If Makefile is changed to

obj-m +=tegra-pcie-ep-mem.o
all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

and invoked standalone, the .ko is created. Once loaded with insmod, modinfo lists it, but lsmod shows it as unused, and I do not see anything new under /sys/kernel/debug.
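In case it helps narrow this down, a few generic checks on whether the loaded module actually registered a PCI driver and bound to the NVIDIA endpoint device (the BDF is the one from my earlier RP lspci output; the driver name to grep for is a guess):

lsmod | grep -i tegra                 # is the module loaded at all?
ls /sys/bus/pci/drivers/              # did a new PCI driver directory appear after insmod?
lspci -nnk -s 01:00.0                 # "Kernel driver in use:" shows whether anything bound
dmesg | tail -n 30                    # probe errors usually show up here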

I am not sure which of these modules needs to be built, or how to properly change the makefiles; could you please provide some guidance?

Thanks.

Hi,

Is there possibly a dependency on the PC RP OS or CUDA version (I am using Ubuntu 20.04 and CUDA 11.4)? A somewhat related project, related only in the sense of exposing GPU kernel-driver DMA to another device and to a user-space app, seems to be compatible with earlier Ubuntu but incompatible with newer versions: "GPUDirect RDMA on NVIDIA Jetson AGX Xavier driver build issue". Following the prerequisite steps, including nvidia-dkms- and "Building on an x86 Linux PC, to Run on That PC", on one system there are some missing dependencies:

./build-for-pc-native.sh
./nvidia-ko-to-module-symvers "/lib/modules/5.15.0-57-generic/updates/dkms/nvidia.ko" "Module.symvers"
make -C "/lib/modules/5.15.0-57-generic/build" "M=$PWD" "modules"
make[1]: Entering directory '/usr/src/linux-headers-5.15.0-57-generic'
  MODPOST /home/me/jetson-rdma-picoevb/kernel-module/Module.symvers
ERROR: modpost: "nvidia_p2p_get_pages" [/home/me/jetson-rdma-picoevb/kernel-module/picoevb-rdma.ko] undefined!
ERROR: modpost: "nvidia_p2p_dma_map_pages" [/home/me/jetson-rdma-picoevb/kernel-module/picoevb-rdma.ko] undefined!
ERROR: modpost: "nvidia_p2p_dma_unmap_pages" [/home/me/jetson-rdma-picoevb/kernel-module/picoevb-rdma.ko] undefined!
ERROR: modpost: "nvidia_p2p_put_pages" [/home/me/jetson-rdma-picoevb/kernel-module/picoevb-rdma.ko] undefined!
ERROR: modpost: "nvidia_p2p_free_page_table" [/home/me/jetson-rdma-picoevb/kernel-module/picoevb-rdma.ko] undefined!
make[2]: *** [scripts/Makefile.modpost:133: /home/me/jetson-rdma-picoevb/kernel-module/Module.symvers] Error 1
make[2]: *** Deleting file '/home/me/jetson-rdma-picoevb/kernel-module/Module.symvers'
make[1]: *** [Makefile:1817: modules] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-5.15.0-57-generic'
make: *** [Makefile:18: modules] Error 2

On another system, there is an issue with the kernel symbol file:

./build-for-pc-native.sh
./nvidia-ko-to-module-symvers "/lib/modules/5.4.0-126-generic/kernel/drivers/video/nvidia.ko" "Module.symvers"
make -C "/lib/modules/5.4.0-126-generic/build" "M=$PWD" "modules"
make[1]: Entering directory '/usr/src/linux-headers-5.4.0-126-generic'
  CC [M]  /home/me/jetson-rdma-picoevb/kernel-module/picoevb-rdma.o
  Building modules, stage 2.
  MODPOST 1 modules
FATAL: parse error in symbol dump file
make[2]: *** [scripts/Makefile.modpost:94: __modpost] Error 1
make[1]: *** [Makefile:1675: modules] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-5.4.0-126-generic'
make: *** [Makefile:18: modules] Error 2

I cannot get either tegra-pcie-ep-mem.c (for the remote Orin) or picoevb-rdma.c (for the local GPU) to build and show anything under /dev/. Any guidance on either one?

Thanks.