Hello,
I am trying to get a PCIe card, which works perfectly on an Intel x86 machine, recognized by the NVIDIA Jetson TX2 development kit, but so far have had no luck. I am running the latest version of JetPack 4.2.
The card has two devices; each is a PCIe Gen 2 x2 device.
Kernel: 4.9.140
L4T: 32.2
Rootfs: 18.04.2 LTS (Bionic Beaver)
lspci output on x86
root@x86# lspci -vvv
08:00.0 Ethernet controller: Microsemi / PMC / IDT Device 80e8 (rev 01)
Subsystem: Microsemi / PMC / IDT Device 0001
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 128
Region 0: Memory at d0000000 (64-bit, non-prefetchable)
Region 2: Memory at f0100000 (64-bit, prefetchable)
Region 4: Memory at f0200000 (64-bit, prefetchable)
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1+ D2- AuxCurrent=375mA PME(D0+,D1+,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [50] MSI: Enable+ Count=32/32 Maskable+ 64bit+
Address: 00000000fee00418 Data: 0000
Masking: fffffffc Pending: 00000000
Capabilities: [70] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 25.000W
DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop- FLReset-
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend-
LnkCap: Port #0, Speed 5GT/s, Width x2, ASPM L0s L1, Exit Latency L0s <512ns, L1 <2us
ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x2, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+, OBFF Via message/WAKE#
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled
LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
Capabilities: [100 v2] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
Capabilities: [148 v1] Transaction Processing Hints
Device specific mode supported
No steering table available
Capabilities: [1d4 v1] Latency Tolerance Reporting
Max snoop latency: 0ns
Max no snoop latency: 0ns
Capabilities: [1dc v1] L1 PM Substates
L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
PortCommonModeRestoreTime=10us PortTPowerOnTime=10us
L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
T_CommonMode=0us LTR1.2_Threshold=0ns
L1SubCtl2: T_PwrOn=10us
Capabilities: [1ec v1] Vendor Specific Information: ID=0002 Rev=3 Len=100 <?>
Kernel driver in use: bh2
Kernel modules: bh2
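For comparing the negotiated link against what the card advertises, a minimal Python sketch along these lines could be used. The helper name link_summary is hypothetical (not part of any tool), and it assumes the text comes from lspci -vvv for a single device. Fed with the two link lines above, it shows that even on the x86 host the link runs at 2.5GT/s although the card advertises 5GT/s.

```python
import re

def link_summary(lspci_text):
    """Extract (speed, width) from the LnkCap and LnkSta lines of one device.

    Hypothetical helper: assumes `lspci -vvv` text for a single device.
    """
    result = {}
    for field in ("LnkCap", "LnkSta"):
        # "LnkSta:" with the colon does not match "LnkSta2:".
        # [^,]* tolerates annotations that some lspci versions add after the speed.
        m = re.search(rf"{field}:.*?Speed ([\d.]+GT/s)[^,]*, Width (x\d+)", lspci_text)
        if m:
            result[field] = (m.group(1), m.group(2))
    return result

# The two link lines copied from the x86 output above.
sample = """\
LnkCap: Port #0, Speed 5GT/s, Width x2, ASPM L0s L1, Exit Latency L0s <512ns, L1 <2us
LnkSta: Speed 2.5GT/s, Width x2, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
"""

summary = link_summary(sample)
# True when the negotiated link differs from the advertised capability.
below_capability = summary["LnkCap"] != summary["LnkSta"]
print(summary, below_capability)
```

The same function could be pointed at the TX2 output once the devices enumerate, to see whether the link trains at the expected speed and width there.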
No changes were made to the device tree, as I’m using the NVIDIA carrier board.
I’ve added these kernel config options:
--- a/arch/arm64/configs/tegra_defconfig
+++ b/arch/arm64/configs/tegra_defconfig
@@ -47,8 +47,11 @@ CONFIG_PARTITION_ADVANCED=y
# CONFIG_IOSCHED_DEADLINE is not set
CONFIG_ARCH_TEGRA=y
CONFIG_PCI=y
+CONFIG_PCI_DEBUG=y
CONFIG_PCIEPORTBUS=y
-CONFIG_PCIEASPM_POWERSAVE=y
+# CONFIG_PCIEASPM_POWERSAVE is not set
+CONFIG_PCIEASPM_PERFORMANCE=y
+CONFIG_PCIEASPM_DEBUG=y
CONFIG_PCI_STUB=m
CONFIG_PCI_IOV=y
CONFIG_PCIE_TEGRA=y
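As a sanity check that the running kernel really carries these options, something like the following Python sketch could be used. The function name config_values is hypothetical, and the sample text simply mirrors the diff above rather than a real dump from the board.

```python
import re

def config_values(config_text, names):
    """Map each option name to its value, 'n' if explicitly unset, or None if absent."""
    values = {}
    for name in names:
        m = re.search(rf"^{name}=(.*)$", config_text, re.MULTILINE)
        if m:
            values[name] = m.group(1)
        elif re.search(rf"^# {name} is not set$", config_text, re.MULTILINE):
            values[name] = "n"
        else:
            values[name] = None
    return values

# Sample text mirroring the defconfig after the diff above (not a real dump).
sample_config = """\
CONFIG_PCI=y
CONFIG_PCI_DEBUG=y
CONFIG_PCIEPORTBUS=y
# CONFIG_PCIEASPM_POWERSAVE is not set
CONFIG_PCIEASPM_PERFORMANCE=y
CONFIG_PCIEASPM_DEBUG=y
"""

wanted = [
    "CONFIG_PCI_DEBUG",
    "CONFIG_PCIEASPM_POWERSAVE",
    "CONFIG_PCIEASPM_PERFORMANCE",
    "CONFIG_PCIEASPM_DEBUG",
]
checked = config_values(sample_config, wanted)
for name, value in checked.items():
    print(name, value)
```

On the target, the text could instead come from gzip.open("/proc/config.gz", "rt").read(), assuming the kernel was built with CONFIG_IKCONFIG_PROC.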
The two devices are not detected.
lspci output on jetson-tx2
root@nvidia-tx2:~# lspci
00:01.0 PCI bridge: NVIDIA Corporation Device 10e5 (rev a1)
dmesg output
root@nvidia-tx2:~# dmesg | grep -i pci
[ 0.698700] PCI: CLS 0 bytes, default 64
[ 1.077964] tegra-pcie 10003000.pcie-controller: 4x1, 1x1 configuration
[ 1.079114] tegra-pcie 10003000.pcie-controller: PCIE: Enable power rails
[ 1.079533] tegra-pcie 10003000.pcie-controller: probing port 0, using 4 lanes
[ 1.082854] tegra-pcie 10003000.pcie-controller: probing port 2, using 1 lanes
[ 1.211295] Intel(R) 10GbE PCI Express Linux Network Driver - version 4.6.4
[ 1.222375] ehci-pci: EHCI PCI platform driver
[ 1.222416] ohci-pci: OHCI PCI platform driver
[ 1.530738] tegra-pcie 10003000.pcie-controller: link 2 down, retrying
[ 1.934953] tegra-pcie 10003000.pcie-controller: link 2 down, retrying
[ 2.338874] tegra-pcie 10003000.pcie-controller: link 2 down, retrying
[ 2.340905] tegra-pcie 10003000.pcie-controller: link 2 down, ignoring
[ 2.546608] tegra-pcie 10003000.pcie-controller: PCI host bridge to bus 0000:00
[ 2.546615] pci_bus 0000:00: root bus resource [io 0x0000-0xffff]
[ 2.546619] pci_bus 0000:00: root bus resource [mem 0x40100000-0x47ffffff]
[ 2.546623] pci_bus 0000:00: root bus resource [mem 0x48000000-0x7fffffff pref]
[ 2.546627] pci_bus 0000:00: root bus resource [bus 00-ff]
[ 2.546750] pci 0000:00:01.0: [10de:10e5] type 01 class 0x060400
[ 2.547266] pci 0000:00:01.0: PME# supported from D0 D1 D2 D3hot D3cold
[ 2.547638] pci 0000:00:01.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[ 2.547970] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
[ 2.547988] tegra-pcie 10003000.pcie-controller: PCIE: Response decoding error, signature: 10010001
[ 2.548074] pci 0000:00:01.0: PCI bridge to [bus 01]
[ 2.548596] pcieport 0000:00:01.0: Signaling PME through PCIe PME interrupt
[ 2.548621] pcie_pme 0000:00:01.0:pcie001: service driver pcie_pme loaded
[ 2.548791] aer 0000:00:01.0:pcie002: service driver aer loaded
[ 5.724459] pcieport 0000:00:01.0: AER: Multiple Corrected error received: id=0020
[ 5.724501] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
[ 5.736884] pcieport 0000:00:01.0: device [10de:10e5] error status/mask=00000001/00002000
[ 5.745757] pcieport 0000:00:01.0: [ 0] Receiver Error (First)
[ 5.752981] pcieport 0000:00:01.0: AER: Multiple Corrected error received: id=0020
[ 5.753006] pcieport 0000:00:01.0: can't find device of ID0020
[ 5.753008] pcieport 0000:00:01.0: AER: Multiple Corrected error received: id=0020
[ 5.753031] pcieport 0000:00:01.0: can't find device of ID0020
Forcing a PCI rescan doesn’t help, and the kernel warnings/errors are not consistent from one rescan to the next:
root@nvidia-tx2:~# dmesg -c > /dev/null; echo 1 > /sys/bus/pci/rescan ; dmesg -c
[ 173.830921] pci_bus 0000:01: busn_res: [bus 01] end is updated to 01
[ 173.830929] tegra-pcie 10003000.pcie-controller: PCIE: Response decoding error, signature: 10010001
root@nvidia-tx2:~# dmesg -c > /dev/null; echo 1 > /sys/bus/pci/rescan ; dmesg -c
[ 174.709505] pci_bus 0000:01: busn_res: [bus 01] end is updated to 01
[ 174.709657] pcieport 0000:00:01.0: AER: Uncorrected (Non-Fatal) error received: id=0020
[ 174.709707] pcieport 0000:00:01.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0008(Requester ID)
[ 174.709712] pcieport 0000:00:01.0: device [10de:10e5] error status/mask=00004000/00000000
[ 174.709717] pcieport 0000:00:01.0: [14] Completion Timeout (First)
[ 174.709743] pcieport 0000:00:01.0: broadcast error_detected message
[ 174.709748] pcieport 0000:00:01.0: broadcast mmio_enabled message
[ 174.709752] pcieport 0000:00:01.0: broadcast resume message
[ 174.709769] pcieport 0000:00:01.0: AER: Device recovery successful
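For reading the "error status/mask" values in these AER messages, a small Python sketch like the one below may help. The bit names follow the AER Correctable and Uncorrectable Error Status register layouts as I understand them from the PCIe specification; the table and function names are my own, and the tables are not exhaustive.

```python
# Bit positions in the AER Correctable Error Status/Mask registers.
CORRECTABLE_BITS = {
    0: "Receiver Error",
    6: "Bad TLP",
    7: "Bad DLLP",
    8: "REPLAY_NUM Rollover",
    12: "Replay Timer Timeout",
    13: "Advisory Non-Fatal Error",
}

# Bit positions in the AER Uncorrectable Error Status/Mask registers.
UNCORRECTABLE_BITS = {
    4: "Data Link Protocol Error",
    5: "Surprise Down",
    12: "Poisoned TLP",
    13: "Flow Control Protocol Error",
    14: "Completion Timeout",
    15: "Completer Abort",
    16: "Unexpected Completion",
    17: "Receiver Overflow",
    18: "Malformed TLP",
    19: "ECRC Error",
    20: "Unsupported Request",
    21: "ACS Violation",
}

def decode(value, table):
    """Return the names of all known bits set in a status or mask value."""
    return [name for bit, name in sorted(table.items()) if value & (1 << bit)]

# Values taken from the dmesg output above.
boot_status = decode(0x00000001, CORRECTABLE_BITS)      # corrected error at boot
boot_mask = decode(0x00002000, CORRECTABLE_BITS)        # mask reported with it
rescan_status = decode(0x00004000, UNCORRECTABLE_BITS)  # non-fatal error on rescan
print(boot_status, boot_mask, rescan_status)
```

The decoded names match what the kernel prints: a Receiver Error on the physical layer at boot, and a Completion Timeout when the rescan tries to read configuration space behind the root port.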
I’ve also tried disabling the clock request, as suggested in another post, but that didn’t help either:
diff --git a/drivers/pci/host/pci-tegra.c b/drivers/pci/host/pci-tegra.c
index 7b6fbd5d90a8..2648af82df56 100644
--- a/drivers/pci/host/pci-tegra.c
+++ b/drivers/pci/host/pci-tegra.c
@@ -3516,6 +3516,7 @@ static int tegra_pcie_parse_dt(struct tegra_pcie *pcie)
return -EADDRNOTAVAIL;
rp->disable_clock_request = of_property_read_bool(port,
"nvidia,disable-clock-request");
+ rp->disable_clock_request = 1;
rp->rst_gpio = of_get_named_gpio(port, "nvidia,rst-gpio", 0);
if (gpio_is_valid(rp->rst_gpio)) {
Any hints or suggestions on what to check next?
Thanks