Hi,
I’m testing PCIe communication between two Xavier boards (one configured as endpoint, the other as root port), and I’m facing an issue.
Issue: The root port system is frequently inaccessible, and the following error message is printed continuously on the root port system:
[ 83.280350] pcieport 0005:00:00.0: AER: Corrected error received: id=0000
[ 83.280375] pcieport 0005:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0000(Receiver ID)
[ 83.280595] pcieport 0005:00:00.0: device [10de:1ad0] error status/mask=00000001/0000e000
[ 83.280773] pcieport 0005:00:00.0: [ 0] Receiver Error (First)
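For reference, the status/mask pair in the log above can be decoded bit by bit. This is only a minimal POSIX shell sketch: the function name decode_aer_correctable is made up for illustration, and the bit names follow the AER Correctable Error Status register layout from the PCIe specification. The log prints the values without a 0x prefix, so one has to be added when calling it.

```shell
# Decode an AER Correctable Error Status (or Mask) value into bit names.
# Hypothetical helper; $1 must be a number the shell can parse, e.g. 0x00000001.
decode_aer_correctable() (
    value=$(( $1 ))
    names=""
    for entry in \
        "0:Receiver Error" \
        "6:Bad TLP" \
        "7:Bad DLLP" \
        "8:REPLAY_NUM Rollover" \
        "12:Replay Timer Timeout" \
        "13:Advisory Non-Fatal Error" \
        "14:Corrected Internal Error" \
        "15:Header Log Overflow"
    do
        bit=${entry%%:*}     # text before the first colon
        name=${entry#*:}     # text after the first colon
        if [ $(( (value >> bit) & 1 )) -eq 1 ]; then
            names="${names:+$names, }$name"
        fi
    done
    echo "${names:-none}"
)
```

Calling `decode_aer_correctable 0x00000001` prints `Receiver Error`, which matches the "[ 0] Receiver Error" line, and `decode_aer_correctable 0x0000e000` lists the three error sources that the mask suppresses.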
I followed the procedure from the following link to set up the PCIe endpoint on the Xavier board:
https://docs.nvidia.com/jetson/l4t/index.html#page/Tegra%2520Linux%2520Driver%2520Package%2520Development%2520Guide%2Fxavier_PCIe_endpoint_mode.html%23wwpID0E0WD0HA
Step1: I flashed R32.4.3 with ODMDATA=0x09191000 on one Xavier board for the PCIe endpoint system.
Step2: On the other Xavier board, I flashed the same R32.4.3 with ODMDATA=0x09190000 for the PCIe root port system.
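To make the difference between the two ODMDATA values explicit, here is a small sketch that lists the bit positions in which two values differ. The function name odmdata_diff_bits is hypothetical, and what the differing bit means is defined by the linked guide, not by this snippet.

```shell
# Print the bit positions (0 = least significant) where two values differ.
# Hypothetical helper for comparing the two ODMDATA settings.
odmdata_diff_bits() (
    diff=$(( $1 ^ $2 ))
    bit=0
    bits=""
    while [ "$diff" -ne 0 ]; do
        if [ $(( diff & 1 )) -eq 1 ]; then
            bits="${bits:+$bits }$bit"
        fi
        diff=$(( diff >> 1 ))
        bit=$(( bit + 1 ))
    done
    echo "${bits:-none}"
)
```

`odmdata_diff_bits 0x09191000 0x09190000` prints `12`, so the endpoint and root port images differ only in bit 12 of ODMDATA.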
Step3: Connected a PCIe cable between the two Xavier boards. In this cable TX and RX are swapped, and we removed 12V and 3.3V at one end.
Step4: Booted the endpoint Jetson system and checked that the clock mux selects NVHS_SLVS_REFCLK_P/N on the endpoint system:
root@nvep-desktop:/home/nvep# grep 253 /sys/kernel/debug/gpio
gpio-253 ( |pex-refclk-sel-high ) out hi
Step5: Ran the commands below to enable PCIe endpoint mode:
root@nvep-desktop:/home/nvep# cd /sys/kernel/config/pci_ep/
root@nvep-desktop:/sys/kernel/config/pci_ep# mkdir functions/pci_epf_nv_test/func1
root@nvep-desktop:/sys/kernel/config/pci_ep# echo 0x10de > functions/pci_epf_nv_test/func1/vendorid
root@nvep-desktop:/sys/kernel/config/pci_ep# echo 0x0001 > functions/pci_epf_nv_test/func1/deviceid
root@nvep-desktop:/sys/kernel/config/pci_ep# ln -s functions/pci_epf_nv_test/func1 controllers/141a0000.pcie_ep/
root@nvep-desktop:/sys/kernel/config/pci_ep# echo 1 > controllers/141a0000.pcie_ep/start
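The same sequence can be wrapped in a function that stops at the first failing step, which makes it easier to see which write is rejected. This is a sketch only: the function name start_pcie_ep and the optional directory argument are my additions (the argument allows a dry run against a scratch directory), while the paths and values are the ones used above.

```shell
# Run the endpoint configfs sequence, stopping at the first failure.
# $1: pci_ep configfs directory (assumed default: /sys/kernel/config/pci_ep).
start_pcie_ep() (
    root=${1:-/sys/kernel/config/pci_ep}
    func=functions/pci_epf_nv_test/func1
    ctrl=controllers/141a0000.pcie_ep

    cd "$root" || exit 1
    [ -d "$ctrl" ] || { echo "missing $ctrl" >&2; exit 1; }

    mkdir "$func" || exit 1                     # create the function instance
    echo 0x10de > "$func/vendorid" || exit 1    # vendor ID seen by the root port
    echo 0x0001 > "$func/deviceid" || exit 1    # device ID seen by the root port
    ln -s "$func" "$ctrl/" || exit 1            # bind the function to the controller
    echo 1 > "$ctrl/start" || exit 1            # start the endpoint controller
)
```

Running `start_pcie_ep` as root on the endpoint board performs the five steps; a non-zero exit status means one of them failed.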
Step6: Booted the root port Jetson system. The following error is printed continuously, and only sometimes are we able to access the device through ssh:
[ 83.280350] pcieport 0005:00:00.0: AER: Corrected error received: id=0000
[ 83.280375] pcieport 0005:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0000(Receiver ID)
[ 83.280595] pcieport 0005:00:00.0: device [10de:1ad0] error status/mask=00000001/0000e000
[ 83.280773] pcieport 0005:00:00.0: [ 0] Receiver Error (First)
[ 83.371942] pcieport 0005:00:00.0: AER: Corrected error received: id=0000
[ 83.371963] pcieport 0005:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0000(Receiver ID)
[ 83.372183] pcieport 0005:00:00.0: device [10de:1ad0] error status/mask=00000001/0000e000
[ 83.372362] pcieport 0005:00:00.0: [ 0] Receiver Error (First)
[ 83.382454] pcieport 0005:00:00.0: AER: Corrected error received: id=0000
[ 83.382473] pcieport 0005:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0000(Receiver ID)
[ 83.382685] pcieport 0005:00:00.0: device [10de:1ad0] error status/mask=00000001/0000e000
[ 83.382848] pcieport 0005:00:00.0: [ 0] Receiver Error (First)
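To quantify how often the error fires, the kernel log can be summarised per device. The snippet below is a rough sketch that counts "Corrected error received" lines on standard input; the function name count_aer_corrected is hypothetical, and it assumes the log format shown above.

```shell
# Count AER corrected-error reports per PCIe port from kernel log text on stdin.
# Hypothetical helper; prints "<device> <count>" for each reporting port.
count_aer_corrected() {
    awk '
    /AER: Corrected error received/ {
        for (i = 1; i < NF; i++) {
            if ($i == "pcieport") {
                dev = $(i + 1)
                sub(/:$/, "", dev)   # drop the trailing colon after the address
                count[dev]++
                break
            }
        }
    }
    END {
        for (dev in count) print dev, count[dev]
    }'
}
```

For example, `dmesg | count_aer_corrected` on the root port system prints one line per port. Comparing the counts before and after a cable change gives a simple way to tell whether the change made any difference.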
root@nvrp-desktop:/home/nvrp# setpci -s 0005:01:00.0 COMMAND=0x02
root@nvrp-desktop:/home/nvrp# lspci -v
0001:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad2 (rev a1) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0, IRQ 35
Bus: primary=00, secondary=01, subordinate=ff, sec-latency=0
I/O behind bridge: 00000000-00000fff
Memory behind bridge: 40000000-400fffff
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [70] Express Root Port (Slot-), MSI 00
Capabilities: [b0] MSI-X: Enable- Count=1 Masked-
Capabilities: [100] Advanced Error Reporting
Capabilities: [148] #19
Capabilities: [158] #26
Capabilities: [17c] #27
Capabilities: [190] L1 PM Substates
Capabilities: [1a0] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
Capabilities: [2a0] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
Capabilities: [2d8] #25
Capabilities: [2e4] Precision Time Measurement
Capabilities: [2f0] Vendor Specific Information: ID=0004 Rev=1 Len=054 <?>
Kernel driver in use: pcieport
0001:01:00.0 SATA controller: Marvell Technology Group Ltd. Device 9171 (rev 13) (prog-if 01 [AHCI 1.0])
Subsystem: Marvell Technology Group Ltd. Device 9171
Flags: bus master, fast devsel, latency 0, IRQ 564
I/O ports at 100010 [size=8]
I/O ports at 100020 [size=4]
I/O ports at 100018 [size=8]
I/O ports at 100024 [size=4]
I/O ports at 100000 [size=16]
Memory at 1230010000 (32-bit, non-prefetchable) [size=512]
Expansion ROM at 1230000000 [disabled] [size=64K]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit-
Capabilities: [70] Express Legacy Endpoint, MSI 00
Capabilities: [100] Advanced Error Reporting
Kernel driver in use: ahci
0005:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad0 (rev a1) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0, IRQ 39
Bus: primary=00, secondary=01, subordinate=ff, sec-latency=0
Memory behind bridge: 40000000-401fffff
Prefetchable memory behind bridge: 0000001c00000000-0000001c000fffff
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
Capabilities: [70] Express Root Port (Slot-), MSI 00
Capabilities: [b0] MSI-X: Enable- Count=8 Masked-
Capabilities: [100] Advanced Error Reporting
Capabilities: [148] #19
Capabilities: [168] #26
Capabilities: [190] #27
Capabilities: [1c0] L1 PM Substates
Capabilities: [1d0] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
Capabilities: [2d0] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
Capabilities: [308] #25
Capabilities: [314] Precision Time Measurement
Capabilities: [320] Vendor Specific Information: ID=0004 Rev=1 Len=054 <?>
Kernel driver in use: pcieport
0005:01:00.0 RAM memory: NVIDIA Corporation Device 0001
Flags: fast devsel, IRQ 255
Memory at 1f40100000 (32-bit, non-prefetchable) [disabled] [size=64K]
Memory at 1c00000000 (64-bit, prefetchable) [disabled] [size=128K]
Memory at 1f40000000 (64-bit, non-prefetchable) [disabled] [size=1M]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit-
Capabilities: [70] Express Endpoint, MSI 00
Capabilities: [b0] MSI-X: Enable- Count=8 Masked-
Capabilities: [100] Advanced Error Reporting
Capabilities: [148] #19
Capabilities: [168] #26
Capabilities: [190] #27
Capabilities: [1b8] Latency Tolerance Reporting
Capabilities: [1c0] L1 PM Substates
Capabilities: [1d0] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
Capabilities: [2d0] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
Capabilities: [308] #25
Capabilities: [314] Precision Time Measurement
Capabilities: [320] Vendor Specific Information: ID=0003 Rev=1 Len=054 <?>
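The setpci call that precedes the lspci -v output writes the PCI Command register of the endpoint function. As a sketch, the helper below names the bits set in a Command register value; the function name decode_pci_command is hypothetical, and the bit names are those of the standard PCI Command register.

```shell
# Name the bits set in a PCI Command register value.
# Hypothetical helper; $1 must be a number the shell can parse, e.g. 0x02.
decode_pci_command() (
    value=$(( $1 ))
    names=""
    for entry in \
        "0:I/O Space" \
        "1:Memory Space" \
        "2:Bus Master" \
        "6:Parity Error Response" \
        "8:SERR# Enable" \
        "10:Interrupt Disable"
    do
        bit=${entry%%:*}
        name=${entry#*:}
        if [ $(( (value >> bit) & 1 )) -eq 1 ]; then
            names="${names:+$names, }$name"
        fi
    done
    echo "${names:-none}"
)
```

`decode_pci_command 0x02` prints `Memory Space`, so the write enables memory decoding only and leaves bus mastering off. The current value can be read back with `setpci -s 0005:01:00.0 COMMAND`, which prints hex without a prefix, and passed in as `decode_pci_command "0x$(setpci -s 0005:01:00.0 COMMAND)"`.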
Step7: Wrote to the shared memory on the endpoint and read it on the root port system, and vice versa
root@nvep-desktop:~# dmesg|grep pci_epf_nv_test
[ 46.499526] pci_epf_nv_test pci_epf_nv_test.0: BAR0 RAM phys: 0x436614000
[ 46.499535] pci_epf_nv_test pci_epf_nv_test.0: BAR0 RAM IOVA: 0xffff0000
[ 46.499559] pci_epf_nv_test pci_epf_nv_test.0: BAR0 RAM virt: 0xffffff8008007000
[Writing in endpoint]
root@nvep-desktop:~# busybox devmem 0x436614000 32 0x12345678
[Reading in rootport]
root@nvrp-desktop:~# busybox devmem 0x1f40100000
0x12345678
[Writing in rootport]
root@nvrp-desktop:~# busybox devmem 0x1f40100004 32 0x09876543
[Reading in endpoint]
root@nvep-desktop:~# busybox devmem 0x436614004
0x09876543
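The addresses used in this test are related by a fixed offset: an address inside the BAR on the root port side corresponds to the endpoint's BAR0 RAM physical address plus the same offset. A minimal sketch of that arithmetic, with a hypothetical function name rp_to_ep_addr:

```shell
# Translate a root port BAR address to the matching endpoint physical address.
# $1: BAR base on the root port, $2: BAR0 RAM phys on the endpoint,
# $3: address accessed on the root port. Hypothetical helper.
rp_to_ep_addr() (
    printf '0x%x\n' $(( $2 + ($3 - $1) ))
)
```

`rp_to_ep_addr 0x1f40100000 0x436614000 0x1f40100004` prints `0x436614004`, the address read on the endpoint in the last command.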
Note:
- Enabled CONFIG_PCIEASPM_PERFORMANCE=y on both the Jetson endpoint and root port systems; the issue still occurs.
- We can access the root port system only sometimes. The rest of the time it is inaccessible and the error message is printed continuously.
- Tested two different PCIe cables and got the same results:
1. Removed only 12V power
2. Removed both 3.3V and 12V power
Hoping for some guidance.
Regards,
Bala