PCI detection problems during system booting

Hi all,

We have problems with PCIe detection during system boot. To test this, we check whether the PCIe device is detected and whether the XDMA devices are loaded. In the failing case (2 or 3 out of every 10 reboots, usually on consecutive reboots), we get these logs:

ubuntu@tegra-ubuntu:~$ ls /dev/xdma*
ls: cannot access '/dev/xdma*': No such file or directory

ubuntu@tegra-ubuntu:~$ lspci -vvv
00:01.0 PCI bridge: NVIDIA Corporation Device 10e5 (rev a1) (prog-if 00 [Normal decode])
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 385
        Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
        I/O behind bridge: 0000f000-00000fff
        Memory behind bridge: 50100000-502fffff
        Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff
        Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
        BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
                PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
        Capabilities: <access denied>
        Kernel driver in use: pcieport

01:00.0 Serial controller: Xilinx Corporation Device 8024 (rev ff) (prog-if ff)
        !!! Unknown header type 7f
        Kernel modules: xdma

PCI dmesg logs:

ubuntu@tegra-ubuntu:~$ dmesg | grep -i pci
                   PCI I/O : 0xffffffbffae00000 - 0xffffffbffbe00000   (    16 MB)
[    0.144089] node /plugin-manager/fragment-500-pcie-config match with board >=3310-1000-500
[    0.144805] node /plugin-manager/fragment-500-e3325-pcie match with board >=3310-1000-500
[    0.268080] GPIO line 460 (wifi-over-pcie) hogged as output/low
[    0.271588] iommu: Adding device 10003000.pcie-controller to group 48
[    0.410598] PCI: CLS 0 bytes, default 128
[    6.301253] tegra-pcie 10003000.pcie-controller: 4x1, 1x1 configuration
[    6.311177] tegra-pcie 10003000.pcie-controller: PCIE: Enable power rails
[    6.319423] tegra-pcie 10003000.pcie-controller: probing port 0, using 4 lanes
[    6.332028] tegra-pcie 10003000.pcie-controller: probing port 2, using 1 lanes
[    6.784200] tegra-pcie 10003000.pcie-controller: link 2 down, retrying
[    7.217681] tegra-pcie 10003000.pcie-controller: link 2 down, retrying
[    7.647025] tegra-pcie 10003000.pcie-controller: link 2 down, retrying
[    7.655032] tegra-pcie 10003000.pcie-controller: link 2 down, ignoring
[    7.661926] tegra-pcie 10003000.pcie-controller: PCI host bridge to bus 0000:00
[    7.669265] pci_bus 0000:00: root bus resource [mem 0x50100000-0x57ffffff]
[    7.676168] pci_bus 0000:00: root bus resource [mem 0x58000000-0x7fffffff pref]
[    7.676174] pci_bus 0000:00: root bus resource [bus 00-ff]
[    7.676177] pci_bus 0000:00: root bus resource [io  0x1000-0xffff]
[    7.676204] pci 0000:00:01.0: [10de:10e5] type 01 class 0x060400
[    7.676295] pci 0000:00:01.0: PME# supported from D0 D1 D2 D3hot D3cold
[    7.676636] pci 0000:00:01.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[    7.676769] pci 0000:01:00.0: [10ee:8024] type 00 class 0x070001
[    7.676808] pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x000fffff]
[    7.676818] pci 0000:01:00.0: reg 0x14: [mem 0x00000000-0x0000ffff]
[    7.677121] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
[    7.677151] pci 0000:00:01.0: BAR 8: assigned [mem 0x50100000-0x502fffff]
[    7.677154] pci 0000:01:00.0: BAR 0: assigned [mem 0x50100000-0x501fffff]
[    7.677160] pci 0000:01:00.0: BAR 1: assigned [mem 0x50200000-0x5020ffff]
[    7.677166] pci 0000:00:01.0: PCI bridge to [bus 01]
[    7.677172] pci 0000:00:01.0:   bridge window [mem 0x50100000-0x502fffff]
[    7.677245] pcieport 0000:00:01.0: enabling device (0000 -> 0002)
[    7.677325] pcieport 0000:00:01.0: Signaling PME through PCIe PME interrupt
[    7.677327] pci 0000:01:00.0: Signaling PME through PCIe PME interrupt
[    7.677333] pcie_pme 0000:00:01.0:pcie01: service driver pcie_pme loaded
[    7.677408] aer 0000:00:01.0:pcie02: service driver aer loaded
[    7.677587] tegra-pcie 10003000.pcie-controller: speed change : Gen-1 -> Gen-2
[    7.740385] pcieport 0000:00:01.0: AER: Uncorrected (Non-Fatal) error received: id=0020
[    7.740398] pcieport 0000:00:01.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0008(Requester ID)
[    7.740400] pcieport 0000:00:01.0:   device [10de:10e5] error status/mask=00004000/00000000
[    7.740403] pcieport 0000:00:01.0:    [14] Completion Timeout     (First)
[    7.740410] pcieport 0000:00:01.0: broadcast error_detected message
[    7.819059] pcieport 0000:00:01.0: AER: Device recovery failed
[    7.819063] pcieport 0000:00:01.0: AER: Uncorrected (Non-Fatal) error received: id=0020
[    7.819076] pcieport 0000:00:01.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0008(Requester ID)
[    7.819078] pcieport 0000:00:01.0:   device [10de:10e5] error status/mask=00004000/00000000
[    7.819081] pcieport 0000:00:01.0:    [14] Completion Timeout     (First)
[    7.819087] pcieport 0000:00:01.0: broadcast error_detected message
[    7.819089] pcieport 0000:00:01.0: AER: Device recovery failed
[    7.819091] pcieport 0000:00:01.0: AER: Multiple Uncorrected (Non-Fatal) error received: id=0020
[    7.819102] pcieport 0000:00:01.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0008(Requester ID)
[    7.819104] pcieport 0000:00:01.0:   device [10de:10e5] error status/mask=00004000/00000000
[    7.819106] pcieport 0000:00:01.0:    [14] Completion Timeout     (First)
[    7.819112] pcieport 0000:00:01.0: broadcast error_detected message
[    7.819113] pcieport 0000:00:01.0: AER: Device recovery failed
[    7.819115] pcieport 0000:00:01.0: AER: Uncorrected (Non-Fatal) error received: id=0020
[    7.819125] pcieport 0000:00:01.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0008(Requester ID)
[    7.819127] pcieport 0000:00:01.0:   device [10de:10e5] error status/mask=00004000/00000000
[    7.819129] pcieport 0000:00:01.0:    [14] Completion Timeout     (First)
[    7.819135] pcieport 0000:00:01.0: broadcast error_detected message
[    7.819136] pcieport 0000:00:01.0: AER: Device recovery failed
[    7.819138] pcieport 0000:00:01.0: AER: Uncorrected (Non-Fatal) error received: id=0020
[    7.819149] pcieport 0000:00:01.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0008(Requester ID)
[    7.819151] pcieport 0000:00:01.0:   device [10de:10e5] error status/mask=00004000/00000000
[    7.819152] pcieport 0000:00:01.0:    [14] Completion Timeout     (First)
[    7.819158] pcieport 0000:00:01.0: broadcast error_detected message
[    7.819159] pcieport 0000:00:01.0: AER: Device recovery failed
[  893.416281] pcieport 0000:00:01.0: AER: Uncorrected (Non-Fatal) error received: id=0020
[  893.574946] pcieport 0000:00:01.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0008(Requester ID)
[  893.586809] pcieport 0000:00:01.0:   device [10de:10e5] error status/mask=00004000/00000000
[  893.595278] pcieport 0000:00:01.0:    [14] Completion Timeout     (First)
[  893.602159] pcieport 0000:00:01.0: broadcast error_detected message
[  893.608484] pcieport 0000:00:01.0: AER: Device recovery failed
[  893.614402] pcieport 0000:00:01.0: AER: Multiple Uncorrected (Non-Fatal) error received: id=0020
[  893.623259] pcieport 0000:00:01.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0008(Requester ID)
[  893.635067] pcieport 0000:00:01.0:   device [10de:10e5] error status/mask=00004000/00000000
[  893.643494] pcieport 0000:00:01.0:    [14] Completion Timeout     (First)
[  893.650347] pcieport 0000:00:01.0: broadcast error_detected message
[  893.656686] pcieport 0000:00:01.0: AER: Device recovery failed
[  893.662618] pcieport 0000:00:01.0: AER: Uncorrected (Non-Fatal) error received: id=0020
[  893.670712] pcieport 0000:00:01.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0008(Requester ID)
[  893.682523] pcieport 0000:00:01.0:   device [10de:10e5] error status/mask=00004000/00000000
[  893.690948] pcieport 0000:00:01.0:    [14] Completion Timeout     (First)
[  893.697826] pcieport 0000:00:01.0: broadcast error_detected message
[  893.704205] pcieport 0000:00:01.0: AER: Device recovery failed
[  893.710099] pcieport 0000:00:01.0: AER: Uncorrected (Non-Fatal) error received: id=0020
[  893.718265] pcieport 0000:00:01.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0008(Requester ID)
[  893.730120] pcieport 0000:00:01.0:   device [10de:10e5] error status/mask=00004000/00000000
[  893.738504] pcieport 0000:00:01.0:    [14] Completion Timeout     (First)
[  893.745312] pcieport 0000:00:01.0: broadcast error_detected message
[  893.751591] pcieport 0000:00:01.0: AER: Device recovery failed
[  893.757442] pcieport 0000:00:01.0: AER: Uncorrected (Non-Fatal) error received: id=0020
[  893.765474] pcieport 0000:00:01.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0008(Requester ID)
[  893.777211] pcieport 0000:00:01.0:   device [10de:10e5] error status/mask=00004000/00000000
[  893.785565] pcieport 0000:00:01.0:    [14] Completion Timeout     (First)
[  893.792361] pcieport 0000:00:01.0: broadcast error_detected message
[  893.798635] pcieport 0000:00:01.0: AER: Device recovery failed
[  893.804474] pcieport 0000:00:01.0: AER: Uncorrected (Non-Fatal) error received: id=0020
[  893.812490] pcieport 0000:00:01.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0008(Requester ID)
[  893.824230] pcieport 0000:00:01.0:   device [10de:10e5] error status/mask=00004000/00000000
[  893.832590] pcieport 0000:00:01.0:    [14] Completion Timeout     (First)
[  893.839399] pcieport 0000:00:01.0: broadcast error_detected message
[  893.845675] pcieport 0000:00:01.0: AER: Device recovery failed
[  893.851513] pcieport 0000:00:01.0: AER: Uncorrected (Non-Fatal) error received: id=0020
[  893.859528] pcieport 0000:00:01.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0008(Requester ID)
[  893.871265] pcieport 0000:00:01.0:   device [10de:10e5] error status/mask=00004000/00000000
[  893.879618] pcieport 0000:00:01.0:    [14] Completion Timeout     (First)
[  893.886417] pcieport 0000:00:01.0: broadcast error_detected message
[  893.892689] pcieport 0000:00:01.0: AER: Device recovery failed
[  893.898529] pcieport 0000:00:01.0: AER: Uncorrected (Non-Fatal) error received: id=0020
[  893.906547] pcieport 0000:00:01.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0008(Requester ID)
[  893.918353] pcieport 0000:00:01.0:   device [10de:10e5] error status/mask=00004000/00000000
[  893.926710] pcieport 0000:00:01.0:    [14] Completion Timeout     (First)
[  893.933505] pcieport 0000:00:01.0: broadcast error_detected message
[  893.939776] pcieport 0000:00:01.0: AER: Device recovery failed
[  893.945618] pcieport 0000:00:01.0: AER: Uncorrected (Non-Fatal) error received: id=0020
[  893.953635] pcieport 0000:00:01.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0008(Requester ID)
[  893.965378] pcieport 0000:00:01.0:   device [10de:10e5] error status/mask=00004000/00000000
[  893.973732] pcieport 0000:00:01.0:    [14] Completion Timeout     (First)
[  893.980527] pcieport 0000:00:01.0: broadcast error_detected message
[  893.986796] pcieport 0000:00:01.0: AER: Device recovery failed
[  893.992632] pcieport 0000:00:01.0: AER: Uncorrected (Non-Fatal) error received: id=0020
[  894.000647] pcieport 0000:00:01.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0008(Requester ID)
[  894.012386] pcieport 0000:00:01.0:   device [10de:10e5] error status/mask=00004000/00000000
[  894.020739] pcieport 0000:00:01.0:    [14] Completion Timeout     (First)
[  894.027536] pcieport 0000:00:01.0: broadcast error_detected message
[  894.033805] pcieport 0000:00:01.0: AER: Device recovery failed
[  894.039643] pcieport 0000:00:01.0: AER: Uncorrected (Non-Fatal) error received: id=0020
[  894.047657] pcieport 0000:00:01.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0008(Requester ID)
[  894.059400] pcieport 0000:00:01.0:   device [10de:10e5] error status/mask=00004000/00000000
[  894.067753] pcieport 0000:00:01.0:    [14] Completion Timeout     (First)
[  894.074551] pcieport 0000:00:01.0: broadcast error_detected message
[  894.080822] pcieport 0000:00:01.0: AER: Device recovery failed
[  894.086661] pcieport 0000:00:01.0: AER: Uncorrected (Non-Fatal) error received: id=0020
[  894.094676] pcieport 0000:00:01.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0008(Requester ID)
[  894.106411] pcieport 0000:00:01.0:   device [10de:10e5] error status/mask=00004000/00000000
[  894.114764] pcieport 0000:00:01.0:    [14] Completion Timeout     (First)
[  894.121561] pcieport 0000:00:01.0: broadcast error_detected message
[  894.127831] pcieport 0000:00:01.0: AER: Device recovery failed
[  894.133671] pcieport 0000:00:01.0: AER: Uncorrected (Non-Fatal) error received: id=0020
[  894.141685] pcieport 0000:00:01.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0008(Requester ID)
[  894.153462] pcieport 0000:00:01.0:   device [10de:10e5] error status/mask=00004000/00000000
[  894.161823] pcieport 0000:00:01.0:    [14] Completion Timeout     (First)
[  894.168620] pcieport 0000:00:01.0: broadcast error_detected message
[  894.174888] pcieport 0000:00:01.0: AER: Device recovery failed
[  894.180729] pcieport 0000:00:01.0: AER: Multiple Uncorrected (Non-Fatal) error received: id=0020
[  894.189525] pcieport 0000:00:01.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0008(Requester ID)
[  894.201262] pcieport 0000:00:01.0:   device [10de:10e5] error status/mask=00004000/00000000
[  894.209619] pcieport 0000:00:01.0:    [14] Completion Timeout     (First)
[  894.216416] pcieport 0000:00:01.0: broadcast error_detected message
[  894.222686] pcieport 0000:00:01.0: AER: Device recovery failed

xdma dmesg logs:

ubuntu@tegra-ubuntu:~$ dmesg | grep -i xdma
[    7.730087] xdma:xdma_mod_init: Xilinx XDMA Reference Driver xdma v2020.1.8
[    7.730090] xdma:xdma_mod_init: desc_blen_max: 0xfffffff/268435455, timeout: h2c 10 c2h 10 sec.
[    7.730167] xdma:xdma_device_open: xdma device 0000:01:00.0, 0xffffffc1e1179800.
[    7.730330] xdma:map_single_bar: BAR0 at 0x50100000 mapped at 0xffffff8005a80000, length=1048576(/1048576)
[    7.750978] xdma:map_single_bar: BAR1 at 0x50200000 mapped at 0xffffff8005a60000, length=65536(/65536)
[    7.771023] xdma:map_bars: Failed to detect XDMA config BAR
[    7.819030] xdma:probe_one: pdev 0xffffffc1e1179800, err -22.
[    7.819032] xdma:xpdev_free: xpdev 0xffffffc07026c000, destroy_interfaces, xdev 0x          (null).
[    7.819034] xdma:xpdev_free: xpdev 0xffffffc07026c000, xdev 0x          (null) xdma_device_close.
[    7.819043] xdma: probe of 0000:01:00.0 failed with error -22
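
A minimal sketch of how such a per-boot check could be scripted (not the exact test used here). The function names are hypothetical, and the patterns are taken only from the log lines shown above:

```python
import glob
import re

# Patterns copied from the failing-boot logs above; adjust to your own dmesg output.
FAILURE_PATTERNS = {
    "aer_completion_timeout": re.compile(r"Completion Timeout"),
    "xdma_no_config_bar": re.compile(r"Failed to detect XDMA config BAR"),
    "xdma_probe_failed": re.compile(r"xdma: probe of \S+ failed with error"),
}


def xdma_nodes(dev_dir="/dev"):
    """Return the XDMA device nodes, i.e. what `ls /dev/xdma*` would list."""
    return sorted(glob.glob(f"{dev_dir}/xdma*"))


def failure_signatures(dmesg_text):
    """Return the names of all known failure patterns found in the dmesg text."""
    return {name for name, pat in FAILURE_PATTERNS.items() if pat.search(dmesg_text)}


def boot_ok(dmesg_text, dev_dir="/dev"):
    """A boot counts as good if XDMA nodes exist and no failure pattern matched."""
    return bool(xdma_nodes(dev_dir)) and not failure_signatures(dmesg_text)
```

The dmesg text can be collected with `subprocess.run(["dmesg"], capture_output=True, text=True).stdout` and passed to `boot_ok` after each reboot to count good and bad boots.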

Do you know what could be causing this error?

Thanks and regards,
Antonio

Note that fully verbose “lspci -vvv” output omits much information unless you run it with sudo (hence the “Capabilities: <access denied>” line above).

I did not read everything above, but with an FPGA it is very common for the host to boot faster than the FPGA can. As a simple matter of timing, the FPGA is probably just not ready when the boot reaches the point where it probes for PCI devices. Are you able to fully boot the FPGA (even if only for testing) before powering up or power-cycling the Jetson? If something else were wrong, I strongly suspect the setup would not work at all.
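
If pre-booting the FPGA is awkward, another way to probe the timing theory is to re-enumerate the endpoint once the system is up. Below is a rough Python sketch, under these assumptions: the address 0000:01:00.0 comes from the logs above, the `remove` and `rescan` files are the standard Linux PCI sysfs interface, the helper names are hypothetical, and the writes need root. Whether a rescan is enough to recover the link on this particular Tegra controller is not verified here.

```python
import time
from pathlib import Path

SYSFS_PCI = Path("/sys/bus/pci")
ENDPOINT = "0000:01:00.0"  # endpoint address taken from the lspci output above


def config_is_all_ones(config: bytes) -> bool:
    """True when the vendor/device ID reads back as 0xffffffff.

    All-ones reads are what a non-responding device returns; they are also
    why lspci shows "rev ff" and "Unknown header type 7f".
    """
    return len(config) >= 4 and config[:4] == b"\xff\xff\xff\xff"


def endpoint_responds(root: Path = SYSFS_PCI, bdf: str = ENDPOINT) -> bool:
    """Read the first config-space bytes from sysfs and check they look sane."""
    cfg = root / "devices" / bdf / "config"
    try:
        return not config_is_all_ones(cfg.read_bytes()[:4])
    except OSError:
        return False  # device not present (or not readable) at all


def remove_and_rescan(root: Path = SYSFS_PCI, bdf: str = ENDPOINT,
                      settle_s: float = 1.0) -> None:
    """Drop the stale device entry, wait, then ask the kernel to rescan the bus."""
    remove = root / "devices" / bdf / "remove"
    if remove.exists():
        remove.write_text("1")
    time.sleep(settle_s)
    (root / "rescan").write_text("1")
```

On a bad boot, calling `remove_and_rescan()` and then `endpoint_responds()` shows whether the FPGA answers once it has had more time. If it does, that points to boot timing rather than a wiring or bitstream problem.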
