PCIe not being recognized

Hello,

I am trying to get a PCIe card that works perfectly on an Intel x86 machine to be recognized by the NVIDIA TX2 development kit, but so far I have had no luck. I am running the latest version of JetPack 4.2.
The card has two devices; each one is a PCIe x2 Gen 2 device.

Kernel: 4.9.140
L4T: 32.2
Rootfs: 18.04.2 LTS (Bionic Beaver)

lspci output on x86

root@x86# lspci -vvv
08:00.0 Ethernet controller: Microsemi / PMC / IDT Device 80e8 (rev 01)
	Subsystem: Microsemi / PMC / IDT Device 0001
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 128
	Region 0: Memory at d0000000 (64-bit, non-prefetchable) 
	Region 2: Memory at f0100000 (64-bit, prefetchable) 
	Region 4: Memory at f0200000 (64-bit, prefetchable) 
	Capabilities: [40] Power Management version 3
		Flags: PMEClk- DSI- D1+ D2- AuxCurrent=375mA PME(D0+,D1+,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [50] MSI: Enable+ Count=32/32 Maskable+ 64bit+
		Address: 00000000fee00418  Data: 0000
		Masking: fffffffc  Pending: 00000000
	Capabilities: [70] Express (v2) Endpoint, MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 25.000W
		DevCtl:	Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop- FLReset-
			MaxPayload 128 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend-
		LnkCap:	Port #0, Speed 5GT/s, Width x2, ASPM L0s L1, Exit Latency L0s <512ns, L1 <2us
			ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk-
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x2, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+, OBFF Via message/WAKE#
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled
		LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [100 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
	Capabilities: [148 v1] Transaction Processing Hints
		Device specific mode supported
		No steering table available
	Capabilities: [1d4 v1] Latency Tolerance Reporting
		Max snoop latency: 0ns
		Max no snoop latency: 0ns
	Capabilities: [1dc v1] L1 PM Substates
		L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
			  PortCommonModeRestoreTime=10us PortTPowerOnTime=10us
		L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
			   T_CommonMode=0us LTR1.2_Threshold=0ns
		L1SubCtl2: T_PwrOn=10us
	Capabilities: [1ec v1] Vendor Specific Information: ID=0002 Rev=3 Len=100 <?>
	Kernel driver in use: bh2
	Kernel modules: bh2
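
To compare link state between the two machines, the advertised and negotiated link parameters can be filtered out of `lspci -vvv`. This is only a sketch: `link_summary` is a hypothetical helper name, and it assumes the `LnkCap:` / `LnkSta:` line layout shown in the output above. Fed the x86 output above, it would report a capability of 5GT/s x2 against a trained link of 2.5GT/s x2.

```shell
#!/bin/sh
# Hypothetical helper: read `lspci -vvv` text on stdin and print the
# advertised (LnkCap) and negotiated (LnkSta) speed and width.
# The trailing colon in the patterns keeps LnkCap2/LnkSta2 lines out.
link_summary() {
    awk '
        /LnkCap:/ {
            if (match($0, /Speed [^,]+, Width x[0-9]+/))
                print "cap: " substr($0, RSTART, RLENGTH)
        }
        /LnkSta:/ {
            if (match($0, /Speed [^,]+, Width x[0-9]+/))
                print "sta: " substr($0, RSTART, RLENGTH)
        }
    '
}

# Usage on a machine where the card enumerates (address taken from the x86 output):
#   lspci -vvv -s 08:00.0 | link_summary
```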

No changes were made to the device tree, as I’m using the NVIDIA carrier board.
I’ve added these kernel config options:

--- a/arch/arm64/configs/tegra_defconfig
+++ b/arch/arm64/configs/tegra_defconfig
@@ -47,8 +47,11 @@ CONFIG_PARTITION_ADVANCED=y
 # CONFIG_IOSCHED_DEADLINE is not set
 CONFIG_ARCH_TEGRA=y
 CONFIG_PCI=y
+CONFIG_PCI_DEBUG=y
 CONFIG_PCIEPORTBUS=y
-CONFIG_PCIEASPM_POWERSAVE=y
+# CONFIG_PCIEASPM_POWERSAVE is not set
+CONFIG_PCIEASPM_PERFORMANCE=y
+CONFIG_PCIEASPM_DEBUG=y
 CONFIG_PCI_STUB=m
 CONFIG_PCI_IOV=y
 CONFIG_PCIE_TEGRA=y

The two devices are not seen
lspci output on jetson-tx2

root@nvidia-tx2:~# lspci
00:01.0 PCI bridge: NVIDIA Corporation Device 10e5 (rev a1)

dmesg output

root@nvidia-tx2:~# dmesg | grep -i pci 
[    0.698700] PCI: CLS 0 bytes, default 64
[    1.077964] tegra-pcie 10003000.pcie-controller: 4x1, 1x1 configuration
[    1.079114] tegra-pcie 10003000.pcie-controller: PCIE: Enable power rails
[    1.079533] tegra-pcie 10003000.pcie-controller: probing port 0, using 4 lanes
[    1.082854] tegra-pcie 10003000.pcie-controller: probing port 2, using 1 lanes
[    1.211295] Intel(R) 10GbE PCI Express Linux Network Driver - version 4.6.4
[    1.222375] ehci-pci: EHCI PCI platform driver
[    1.222416] ohci-pci: OHCI PCI platform driver
[    1.530738] tegra-pcie 10003000.pcie-controller: link 2 down, retrying
[    1.934953] tegra-pcie 10003000.pcie-controller: link 2 down, retrying
[    2.338874] tegra-pcie 10003000.pcie-controller: link 2 down, retrying
[    2.340905] tegra-pcie 10003000.pcie-controller: link 2 down, ignoring
[    2.546608] tegra-pcie 10003000.pcie-controller: PCI host bridge to bus 0000:00
[    2.546615] pci_bus 0000:00: root bus resource [io  0x0000-0xffff]
[    2.546619] pci_bus 0000:00: root bus resource [mem 0x40100000-0x47ffffff]
[    2.546623] pci_bus 0000:00: root bus resource [mem 0x48000000-0x7fffffff pref]
[    2.546627] pci_bus 0000:00: root bus resource [bus 00-ff]
[    2.546750] pci 0000:00:01.0: [10de:10e5] type 01 class 0x060400
[    2.547266] pci 0000:00:01.0: PME# supported from D0 D1 D2 D3hot D3cold
[    2.547638] pci 0000:00:01.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[    2.547970] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
[    2.547988] tegra-pcie 10003000.pcie-controller: PCIE: Response decoding error, signature: 10010001
[    2.548074] pci 0000:00:01.0: PCI bridge to [bus 01]
[    2.548596] pcieport 0000:00:01.0: Signaling PME through PCIe PME interrupt
[    2.548621] pcie_pme 0000:00:01.0:pcie001: service driver pcie_pme loaded
[    2.548791] aer 0000:00:01.0:pcie002: service driver aer loaded
[    5.724459] pcieport 0000:00:01.0: AER: Multiple Corrected error received: id=0020
[    5.724501] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
[    5.736884] pcieport 0000:00:01.0:   device [10de:10e5] error status/mask=00000001/00002000
[    5.745757] pcieport 0000:00:01.0:    [ 0] Receiver Error         (First)
[    5.752981] pcieport 0000:00:01.0: AER: Multiple Corrected error received: id=0020
[    5.753006] pcieport 0000:00:01.0: can't find device of ID0020
[    5.753008] pcieport 0000:00:01.0: AER: Multiple Corrected error received: id=0020
[    5.753031] pcieport 0000:00:01.0: can't find device of ID0020
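
The `id=` values in the AER lines are 16-bit PCI requester IDs: bus number in bits 15:8, device in bits 7:3, function in bits 2:0. A minimal sketch for decoding them follows; `aer_id_to_bdf` is a hypothetical helper name, not part of any tool. With the values from the log, `0008` decodes to `00:01.0` (the root port listed by lspci), while `0020` decodes to `00:04.0`, which does not appear in the lspci output and is consistent with the "can't find device of ID0020" message.

```shell
#!/bin/sh
# Hypothetical helper: decode a 4-digit hex requester ID (as printed in the
# AER messages) into bus:device.function notation.
aer_id_to_bdf() {
    id=$((0x$1))
    printf '%02x:%02x.%x\n' \
        $(( (id >> 8) & 0xff )) \
        $(( (id >> 3) & 0x1f )) \
        $(( id & 0x7 ))
}

# Usage:
#   aer_id_to_bdf 0008
#   aer_id_to_bdf 0020
```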

Forcing a PCI rescan doesn’t help, and the kernel warnings/errors are not consistent from one attempt to the next:

root@nvidia-tx2:~# dmesg -c > /dev/null; echo 1 > /sys/bus/pci/rescan ; dmesg -c
[  173.830921] pci_bus 0000:01: busn_res: [bus 01] end is updated to 01
[  173.830929] tegra-pcie 10003000.pcie-controller: PCIE: Response decoding error, signature: 10010001
root@nvidia-tx2:~# dmesg -c > /dev/null; echo 1 > /sys/bus/pci/rescan ; dmesg -c
[  174.709505] pci_bus 0000:01: busn_res: [bus 01] end is updated to 01
[  174.709657] pcieport 0000:00:01.0: AER: Uncorrected (Non-Fatal) error received: id=0020
[  174.709707] pcieport 0000:00:01.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0008(Requester ID)
[  174.709712] pcieport 0000:00:01.0:   device [10de:10e5] error status/mask=00004000/00000000
[  174.709717] pcieport 0000:00:01.0:    [14] Completion Timeout     (First)
[  174.709743] pcieport 0000:00:01.0: broadcast error_detected message
[  174.709748] pcieport 0000:00:01.0: broadcast mmio_enabled message
[  174.709752] pcieport 0000:00:01.0: broadcast resume message
[  174.709769] pcieport 0000:00:01.0: AER: Device recovery successful

I’ve also tried disabling the clock request, as suggested in another post, but that didn’t help either:

diff --git a/drivers/pci/host/pci-tegra.c b/drivers/pci/host/pci-tegra.c
    index 7b6fbd5d90a8..2648af82df56 100644
    --- a/drivers/pci/host/pci-tegra.c
    +++ b/drivers/pci/host/pci-tegra.c
    @@ -3516,6 +3516,7 @@ static int tegra_pcie_parse_dt(struct tegra_pcie *pcie)
                            return -EADDRNOTAVAIL;
                    rp->disable_clock_request = of_property_read_bool(port,
                            "nvidia,disable-clock-request");
    +               rp->disable_clock_request = 1;

                    rp->rst_gpio = of_get_named_gpio(port, "nvidia,rst-gpio", 0);
                    if (gpio_is_valid(rp->rst_gpio)) {

Any hints or suggestions on what to check next?
Thanks

From the logs, it looks like the device got enumerated by the system but, for some reason, has fallen off the bus; hence the physical layer errors, and the completion timeouts when you tried rescanning.
Did you confirm whether the PCIe driver is built into the kernel or built as a module? (‘sudo lsmod’ would give us that info.) If it is built as a module, just updating the kernel image may not help; you may have to copy pci-tegra.ko to the target in that case. Also, just to be sure that the system is indeed using the updated driver, can you add a print and confirm?
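
One way to answer the built-in versus module question is to look at the running kernel's configuration. The sketch below is an illustration only: `config_state` is a hypothetical helper name, and the usage line assumes the kernel exposes `/proc/config.gz` (that is, it was built with `CONFIG_IKCONFIG_PROC`). `CONFIG_PCIE_TEGRA` is the option name that appears in the defconfig diff earlier in the thread.

```shell
#!/bin/sh
# Hypothetical helper: read kernel config text on stdin and report whether
# the option named in $1 is built in (=y), a module (=m), or not set.
config_state() {
    awk -F= -v opt="$1" '
        $1 == opt {
            if ($2 == "y") state = "builtin"
            else if ($2 == "m") state = "module"
            else state = "other"
        }
        END { print (state == "" ? "unset" : state) }
    '
}

# Usage on the target (assumes /proc/config.gz exists):
#   zcat /proc/config.gz | config_state CONFIG_PCIE_TEGRA
```

If this reports `builtin`, the driver will not show up in lsmod and only a reflashed kernel image picks up the change.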

Hello Vidyas,

As suggested, I’m pasting the same logs with a new explicit printk “PCIE: Disable clock request” to make sure that the updated driver is being run. I’ve also captured the dmesg output with a PCIe network adapter (1x) that’s working correctly on both an Intel PC and the NVIDIA TX2.

dmesg output of the 2x Ethernet adapter (NOK)

PCIE: tegra_pcie_probe(4758)
PCIE: tegra_pcie_read_plat_data(3243)
PCIE: tegra_pcie_parse_dt(3345)
tegra-pcie 10003000.pcie-controller: PCIE: Disable clock request
tegra-pcie 10003000.pcie-controller: PCIE: Disable clock request
tegra-pcie 10003000.pcie-controller: 4x1, 1x1 configuration
PCIE: tegra_pcie_probe_complete(4638)
PCIE: tegra_pcie_init(2805)
PCIE: tegra_pcie_get_resources(1940)
PCIE: tegra_pcie_get_clocks(1303)
PCIE: tegra_pcie_enable_regulators(1553)
tegra-pcie 10003000.pcie-controller: PCIE: Enable power rails
PCIE: tegra_pcie_power_on(1824)
PCIE: tegra_pcie_restore_device(1765)
PCIE: tegra_pcie_module_power_on(1717)
PCIE: tegra_pcie_enable_regulators(1553)
PCIE: tegra_pcie_map_resources(1598)
PCIE: tegra_pcie_enable_pads(1442)
PCIE: tegra_pcie_enable_controller(1485)
PCIE: tegra_pcie_enable_msi(3042)
PCIE: tegra_pcie_check_ports(2496)
tegra-pcie 10003000.pcie-controller: probing port 0, using 4 lanes
PCIE: tegra_pcie_port_enable(2024)
PCIE: tegra_pcie_port_reset(1995)
PCIE: tegra_pcie_enable_rp_features(2365)
PCIE: tegra_pcie_enable_aer(1044)
PCIE: tegra_pcie_apply_sw_war(2207)
PCIE: tegra_pcie_prsnt_map_override(1102)
tegra-pcie 10003000.pcie-controller: probing port 2, using 1 lanes
PCIE: tegra_pcie_port_enable(2024)
PCIE: tegra_pcie_port_reset(1995)
PCIE: tegra_pcie_enable_rp_features(2365)
PCIE: tegra_pcie_enable_aer(1044)
PCIE: tegra_pcie_apply_sw_war(2207)
PCIE: tegra_pcie_prsnt_map_override(1102)
tegra-pcie 10003000.pcie-controller: link 0 down, retrying
PCIE: tegra_pcie_port_reset(1995)
tegra-pcie 10003000.pcie-controller: link 0 down, retrying
PCIE: tegra_pcie_port_reset(1995)
tegra-pcie 10003000.pcie-controller: link 2 down, retrying
PCIE: tegra_pcie_port_reset(1995)
tegra-pcie 10003000.pcie-controller: link 2 down, retrying
PCIE: tegra_pcie_port_reset(1995)
tegra-pcie 10003000.pcie-controller: link 2 down, retrying
PCIE: tegra_pcie_port_reset(1995)
tegra-pcie 10003000.pcie-controller: link 2 down, ignoring
PCIE: tegra_pcie_port_disable(2055)
PCIE: tegra_pcie_link_speed(2778)
PCIE: tegra_pcie_scale_voltage(2653)
PCIE: tegra_pcie_conf_gpios(2550)
PCIE: tegra_pcie_enable_msi(3042)
tegra-pcie 10003000.pcie-controller: PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [io  0x0000-0xffff]
pci_bus 0000:00: root bus resource [mem 0x40100000-0x47ffffff]
pci_bus 0000:00: root bus resource [mem 0x48000000-0x7fffffff pref]
pci_bus 0000:00: root bus resource [bus 00-ff]
pci_bus 0000:00: scanning bus
pci 0000:00:01.0: [10de:10e5] type 01 class 0x060400
pci 0000:00:01.0: PME# supported from D0 D1 D2 D3hot D3cold
pci 0000:00:01.0: PME# disabled
iommu: Adding device 0000:00:01.0 to group 55
arm-smmu: forcing sodev map for 0000:00:01.0
pci_bus 0000:00: fixups for bus
pci 0000:00:01.0: scanning [bus 00-00] behind bridge, pass 0
pci 0000:00:01.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci 0000:00:01.0: scanning [bus 00-00] behind bridge, pass 1
PCIE: tegra_pcie_add_bus(780)
PCIE: tegra_pcie_bus_alloc(592)
pci_bus 0000:01: scanning bus
PCIE: tegra_pcie_isr(1211)
tegra-pcie 10003000.pcie-controller: PCIE: Response decoding error, signature: 10010001
pci_bus 0000:01: fixups for bus
pci_bus 0000:01: bus scan returning with max=01
pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
pci_bus 0000:00: bus scan returning with max=01
pci 0000:00:01.0: assign IRQ: got 381
pci 0000:00:01.0: assigning IRQ 381
pci 0000:00:01.0: PCI bridge to [bus 01]
PCIE: tegra_pcie_configure_aspm(920)
PCIE: tegra_pcie_enable_ltr_support(879)
PCIE: tegra_pcie_enable_features(2790)
PCIE: tegra_pcie_apply_sw_war(2207)
pcieport 0000:00:01.0: enabling bus mastering
pcieport 0000:00:01.0: Signaling PME through PCIe PME interrupt
pcie_pme 0000:00:01.0:pcie001: service driver pcie_pme loaded
aer 0000:00:01.0:pcie002: service driver aer loaded
PCIE: tegra_pcie_isr(1211)
PCIE: handle_sb_intr(1144)
...
pcieport 0000:00:01.0: AER: Multiple Corrected error received: id=0020
pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
pcieport 0000:00:01.0:   device [10de:10e5] error status/mask=00000001/00002000
pcieport 0000:00:01.0:    [ 0] Receiver Error         (First)
pcieport 0000:00:01.0: AER: Multiple Corrected error received: id=0020
pcieport 0000:00:01.0: can't find device of ID0020
pcieport 0000:00:01.0: AER: Multiple Corrected error received: id=0020
pcieport 0000:00:01.0: can't find device of ID0020
pcieport 0000:00:01.0: AER: Multiple Corrected error received: id=0020
pcieport 0000:00:01.0: can't find device of ID0020

dmesg output of a 1x Ethernet adapter (TP-Link) (OK)

PCIE: tegra_pcie_probe(4758)
PCIE: tegra_pcie_read_plat_data(3243)
PCIE: tegra_pcie_parse_dt(3345)
tegra-pcie 10003000.pcie-controller: PCIE: Disable clock request
tegra-pcie 10003000.pcie-controller: PCIE: Disable clock request
tegra-pcie 10003000.pcie-controller: 4x1, 1x1 configuration
PCIE: tegra_pcie_probe_complete(4638)
PCIE: tegra_pcie_init(2805)
PCIE: tegra_pcie_get_resources(1940)
PCIE: tegra_pcie_get_clocks(1303)
PCIE: tegra_pcie_enable_regulators(1553)
tegra-pcie 10003000.pcie-controller: PCIE: Enable power rails
PCIE: tegra_pcie_power_on(1824)
PCIE: tegra_pcie_restore_device(1765)
PCIE: tegra_pcie_module_power_on(1717)
PCIE: tegra_pcie_enable_regulators(1553)
PCIE: tegra_pcie_map_resources(1598)
PCIE: tegra_pcie_enable_pads(1442)
PCIE: tegra_pcie_enable_controller(1485)
PCIE: tegra_pcie_enable_msi(3042)
PCIE: tegra_pcie_check_ports(2496)
tegra-pcie 10003000.pcie-controller: probing port 0, using 4 lanes
PCIE: tegra_pcie_port_enable(2024)
PCIE: tegra_pcie_port_reset(1995)
PCIE: tegra_pcie_enable_rp_features(2365)
PCIE: tegra_pcie_enable_aer(1044)
PCIE: tegra_pcie_apply_sw_war(2207)
PCIE: tegra_pcie_prsnt_map_override(1102)
tegra-pcie 10003000.pcie-controller: probing port 2, using 1 lanes
PCIE: tegra_pcie_port_enable(2024)
PCIE: tegra_pcie_port_reset(1995)
PCIE: tegra_pcie_enable_rp_features(2365)
PCIE: tegra_pcie_enable_aer(1044)
PCIE: tegra_pcie_apply_sw_war(2207)
PCIE: tegra_pcie_prsnt_map_override(1102)
tegra-pcie 10003000.pcie-controller: link 2 down, retrying
PCIE: tegra_pcie_port_reset(1995)
tegra-pcie 10003000.pcie-controller: link 2 down, retrying
PCIE: tegra_pcie_port_reset(1995)
tegra-pcie 10003000.pcie-controller: link 2 down, retrying
PCIE: tegra_pcie_port_reset(1995)
tegra-pcie 10003000.pcie-controller: link 2 down, ignoring
PCIE: tegra_pcie_port_disable(2055)
PCIE: tegra_pcie_link_speed(2778)
PCIE: tegra_pcie_scale_voltage(2653)
PCIE: tegra_pcie_conf_gpios(2550)
PCIE: tegra_pcie_enable_msi(3042)
tegra-pcie 10003000.pcie-controller: PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [io  0x0000-0xffff]
pci_bus 0000:00: root bus resource [mem 0x40100000-0x47ffffff]
pci_bus 0000:00: root bus resource [mem 0x48000000-0x7fffffff pref]
pci_bus 0000:00: root bus resource [bus 00-ff]
pci_bus 0000:00: scanning bus
pci 0000:00:01.0: [10de:10e5] type 01 class 0x060400
pci 0000:00:01.0: PME# supported from D0 D1 D2 D3hot D3cold
pci 0000:00:01.0: PME# disabled
iommu: Adding device 0000:00:01.0 to group 55
arm-smmu: forcing sodev map for 0000:00:01.0
pci_bus 0000:00: fixups for bus
pci 0000:00:01.0: scanning [bus 00-00] behind bridge, pass 0
pci 0000:00:01.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci 0000:00:01.0: scanning [bus 00-00] behind bridge, pass 1
PCIE: tegra_pcie_add_bus(780)
PCIE: tegra_pcie_bus_alloc(592)
pci_bus 0000:01: scanning bus
pci 0000:01:00.0: [10ec:8168] type 00 class 0x020000
pci 0000:01:00.0: reg 0x10: [io  0x0000-0x00ff]
pci 0000:01:00.0: reg 0x18: [mem 0x00000000-0x00000fff 64bit]
pci 0000:01:00.0: reg 0x20: [mem 0x00000000-0x00003fff 64bit pref]
pci 0000:01:00.0: supports D1 D2
pci 0000:01:00.0: PME# supported from D0 D1 D2 D3hot D3cold
pci 0000:01:00.0: PME# disabled
iommu: Adding device 0000:01:00.0 to group 56
arm-smmu: forcing sodev map for 0000:01:00.0
pci_bus 0000:01: fixups for bus
pci_bus 0000:01: bus scan returning with max=01
pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
pci_bus 0000:00: bus scan returning with max=01
pci 0000:00:01.0: assign IRQ: got 381
pci 0000:00:01.0: assigning IRQ 381
pci 0000:01:00.0: assign IRQ: got 381
pci 0000:01:00.0: assigning IRQ 381
pci 0000:00:01.0: BAR 14: assigned [mem 0x40100000-0x401fffff]
pci 0000:00:01.0: BAR 15: assigned [mem 0x48000000-0x480fffff 64bit pref]
pci 0000:00:01.0: BAR 13: assigned [io  0x1000-0x1fff]
pci 0000:01:00.0: BAR 4: assigned [mem 0x48000000-0x48003fff 64bit pref]
pci 0000:01:00.0: BAR 2: assigned [mem 0x40100000-0x40100fff 64bit]
pci 0000:01:00.0: BAR 0: assigned [io  0x1000-0x10ff]
pci 0000:00:01.0: PCI bridge to [bus 01]
pci 0000:00:01.0:   bridge window [io  0x1000-0x1fff]
pci 0000:00:01.0:   bridge window [mem 0x40100000-0x401fffff]
pci 0000:00:01.0:   bridge window [mem 0x48000000-0x480fffff 64bit pref]
PCIE: tegra_pcie_configure_aspm(920)
PCIE: tegra_pcie_enable_ltr_support(879)
PCIE: tegra_pcie_enable_features(2790)
PCIE: tegra_pcie_apply_sw_war(2207)
pcieport 0000:00:01.0: enabling device (0000 -> 0003)
pcieport 0000:00:01.0: enabling bus mastering
pcieport 0000:00:01.0: Signaling PME through PCIe PME interrupt
pci 0000:01:00.0: Signaling PME through PCIe PME interrupt
pcie_pme 0000:00:01.0:pcie001: service driver pcie_pme loaded
aer 0000:00:01.0:pcie002: service driver aer loaded
r8168 0000:01:00.0: enabling device (0000 -> 0003)
r8168 0000:01:00.0: enabling bus mastering
r8168 Gigabit Ethernet driver 8.045.08-NAPI loaded
r8168 0000:01:00.0: enabling Mem-Wr-Inval
PCIE: tegra_msi_setup_irq(2980)
PCIE: tegra_msi_alloc(2893)
PCIE: tegra_msi_map(3025)
r8168: This product is covered by one or more of the following patents: US6,570,884, US6,115,776, and US6,327,625.
r8168  Copyright (C) 2017  Realtek NIC software team <nicfae@realtek.com> 
 This program comes with ABSOLUTELY NO WARRANTY; for details, please see <http://www.gnu.org/licenses/>. 
 This is free software, and you are welcome to redistribute it under certain conditions; see <http://www.gnu.org/licenses/>. 
IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
eth1: 0xffffff8008069000, 50:3e:aa:0b:78:5c, IRQ 446
IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
r8168: eth1: link up
IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
PCIE: tegra_pcie_msi_irq(2931)
PCIE: tegra_pcie_msi_irq(2931)
...

The PCIe enumeration is also failing at the u-boot level.

  • u-boot pci enum of the 2x Ethernet adapter (NOK)
Tegra186 (P2771-0000-500) # pci enum
Tegra186 (P2771-0000-500) # pci 0
Scanning PCI devices on bus 0
BusDevFun  VendorId   DeviceId   Device Class       Sub-Class
_____________________________________________________________
00.01.00   0x10de     0x10e5     Bridge device           0x04
Tegra186 (P2771-0000-500) # pci 1
Scanning PCI devices on bus 1
BusDevFun  VendorId   DeviceId   Device Class       Sub-Class
_____________________________________________________________
Tegra186 (P2771-0000-500) # pci 2
No such bus
  • u-boot pci enum of the 1x Ethernet adapter (OK)
Tegra186 (P2771-0000-500) # pci enum
Tegra186 (P2771-0000-500) # pci 0
Scanning PCI devices on bus 0
BusDevFun  VendorId   DeviceId   Device Class       Sub-Class
_____________________________________________________________
00.01.00   0x10de     0x10e5     Bridge device           0x04
Tegra186 (P2771-0000-500) # pci 1
Scanning PCI devices on bus 1
BusDevFun  VendorId   DeviceId   Device Class       Sub-Class
_____________________________________________________________
01.00.00   0x10ec     0x8168     Network controller      0x00

Hi Vidyas,

FYI, disabling clock spread spectrum doesn’t help either.

Would it help if I shared the evolution of the LTSSM here? I have a small script for that:

#!/bin/bash
########################################################################
# This script dumps the LTSSM buffer to a temp file every second.
# Duplicate lines can be removed afterwards with:
# awk '!x[$0]++' temp_file
########################################################################

LTSSM_TRACE=/sys/kernel/debug/pcie/0/dump_ltssm_trace

OUTPUT_TRACE=/tmp/ltssm_trace_redundant

# wait for the creation of the debug file
while [ ! -f "$LTSSM_TRACE" ]; do sleep 1; done

# truncate the output file
: > "$OUTPUT_TRACE"

# append a snapshot of the trace buffer once per second
while true; do
	cat "$LTSSM_TRACE" >> "$OUTPUT_TRACE"
	sleep 1
done
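
Since the loop above appends the whole buffer every second, the output file is mostly repeats. The de-duplication step mentioned in the script header can be wrapped as a small filter; `dedup_lines` is a hypothetical name, and the sketch simply reuses the same awk idiom, keeping the first occurrence of each line in its original order.

```shell
#!/bin/sh
# Hypothetical helper: drop repeated lines from stdin while preserving the
# order in which each distinct line was first seen.
dedup_lines() {
    awk '!seen[$0]++'
}

# Usage (path taken from the script above):
#   dedup_lines < /tmp/ltssm_trace_redundant
```

Note that this removes every later repeat of a line, so a state the LTSSM re-enters after leaving it will only be listed once.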

I’m not sure what is going wrong here. We may need that card locally so that we can connect a PCIe protocol analyzer between it and the TX2 and see why the link is going down.