Orin NX pcie_rp@141e0000: the driver for the Marvell AQC113 connected to Lane 2 failed to start

dmesg:
[ 1438.346418] hot-surface-alert cooling state: 0 → 1
[16188.438044] atlantic 0001:01:00.0 (unnamed net_device) (uninitialized): Hardware revision 0x0
[16202.455585] atlantic: Loaded MAC FW Version: 90.217.65535
[16202.455593] atlantic: Waiting for MAC FW to complete boot and start PHY FW…
[16212.464236] atlantic: Error: Timeout waiting for MAC FW to finish boot
[16212.464238] atlantic: Failure regdump:
[16212.468109] atlantic: rr 0x3040 = 0xd000000
[16212.472601] atlantic: rr 0x03f0 = 0x8000306f
[16212.477177] atlantic: rr 0x0354 = 0x4080801c
[16212.481756] atlantic: rr 0x0358 = 0x9fc02915
[16212.486332] atlantic: rr 0x035c = 0x7f7ffafd
[16212.490909] atlantic: rr 0x308c = 0x20000
[16212.495217] atlantic: HW prepare failed, err = -110
[16212.502949] atlantic: probe of 0001:01:00.0 failed with error -110
[29675.918826] pci 0001:01:00.0: VPD access failed. This is likely a firmware bug on this device. Contact the card vendor for a firmware update
[29675.954111] r8168 0008:01:00.0: invalid short VPD tag 00 at offset 1

lspci -vvv:

0001:01:00.0 Ethernet controller: Aquantia Corp. Device 94c0 (rev 03)
    Subsystem: Aquantia Corp. Device 0001
    Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- SERR- <PERR- INTx-
    Interrupt: pin A routed to IRQ 55
    Region 0: Memory at 20a8400000 (64-bit, non-prefetchable) [size=512K]
    Region 2: Memory at 20a8480000 (64-bit, non-prefetchable) [size=4K]
    Region 4: Memory at 20a8000000 (64-bit, non-prefetchable) [size=4M]
    Capabilities: [40] Power Management version 3
        Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2-,D3hot+,D3cold+)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [50] MSI: Enable- Count=1/32 Maskable+ 64bit+
        Address: 0000000000000000 Data: 0000
        Masking: 00000000 Pending: 00000000
    Capabilities: [70] Express (v2) Endpoint, MSI 00
        DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
            ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
        DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
            RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
            MaxPayload 256 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
        LnkCap: Port #0, Speed 16GT/s, Width x2, ASPM L0s L1, Exit Latency L0s <4us, L1 <64us
            ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
            ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 16GT/s (ok), Width x1 (downgraded)
            TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Not Supported, TimeoutDis+, NROPrPrP-, LTR+
            10BitTagComp+, 10BitTagReq-, OBFF Via message/WAKE#, ExtFmt-, EETLPPrefix-
            EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
            FRS-, TPHComp-, ExtTPHComp-
            AtomicOpsCap: 32bit- 64bit- 128bitCAS-
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled
            AtomicOpsCtl: ReqEn-
        LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis-
            Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
            Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+
            EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
    Capabilities: [b0] MSI-X: Enable- Count=32 Masked-
        Vector table: BAR=2 offset=00000000
        PBA: BAR=2 offset=00000200
    Capabilities: [d0] Vital Product Data

lspci:
0001:00:00.0 PCI bridge: NVIDIA Corporation Device 229e (rev a1)
0001:01:00.0 Ethernet controller: Aquantia Corp. Device 94c0 (rev 03)
0007:00:00.0 PCI bridge: NVIDIA Corporation Device 229a (rev a1)
0007:01:00.0 Non-Volatile memory controller: SK hynix Device 174a
0008:00:00.0 PCI bridge: NVIDIA Corporation Device 229c (rev a1)
0008:01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)
0009:00:00.0 PCI bridge: NVIDIA Corporation Device 229c (rev a1)

Does the PCIe configuration need to be modified for this? Based on your assessment, is this a hardware issue or a software issue?

Looking forward to your answers!

It looks like the PCIe side has already detected the device. Please contact the PHY vendor to check.

Yes. If the device can be detected, does that mean the PCIe device tree does not need any further configuration?

Yes, the PCIe part is already done here.

pcie@141e0000 {
compatible = "nvidia,tegra234-pcie\0snps,dw-pcie";
power-domains = <0x02 0x10>;
reg = <0x00 0x141e0000 0x00 0x20000 0x00 0x3e000000 0x00 0x40000 0x00 0x3e040000 0x00 0x40000 0x00 0x3e080000 0x00 0x40000 0x32 0x30000000 0x00 0x10000000>;
reg-names = "appl\0config\0atu_dma\0dbi\0ecam";
status = "okay";
#address-cells = <0x03>;
#size-cells = <0x02>;
device_type = "pci";
num-lanes = <0x02>;
num-viewport = <0x08>;
linux,pci-domain = <0x07>;
clocks = <0x02 0xab 0x02 0xf4>;
clock-names = "core\0core_m";
resets = <0x02 0x0f 0x02 0x0e>;
reset-names = "apb\0core";
interrupts = <0x00 0x162 0x04 0x00 0x163 0x04>;
interrupt-names = "intr\0msi";
interconnects = <0x03 0x2a 0x03 0x30>;
interconnect-names = "dma-mem\0dma-mem";
iommus = <0x4e 0x08>;
iommu-map = <0x00 0x4e 0x08 0x1000>;
msi-parent = <0x4a 0x08>;
msi-map = <0x00 0x4a 0x08 0x1000>;
dma-coherent;
iommu-map-mask = <0x00>;
#interrupt-cells = <0x01>;
interrupt-map-mask = <0x00 0x00 0x00 0x00>;
interrupt-map = <0x00 0x00 0x00 0x00 0x01 0x00 0x162 0x04>;
nvidia,dvfs-tbl = <0xc28cb00 0xc28cb00 0xc28cb00 0xc28cb00 0xc28cb00 0xc28cb00 0xc28cb00 0x27ac4000 0xc28cb00 0xc28cb00 0x27ac4000 0x5f5e1000 0xc28cb00 0x27ac4000 0x5f5e1000 0x7f22ff40>;
nvidia,max-speed = <0x04>;
nvidia,disable-aspm-states = <0x0f>;
nvidia,controller-id = <0x02 0x07>;
nvidia,tsa-config = <0x200b004>;
nvidia,disable-l1-cpm;
nvidia,aux-clk-freq = <0x13>;
nvidia,preset-init = <0x05>;
nvidia,aspm-cmrt = <0x3c>;
nvidia,aspm-pwr-on-t = <0x14>;
nvidia,aspm-l0s-entrance-latency = <0x03>;
nvidia,bpmp = <0x02 0x07>;
nvidia,aspm-cmrt-us = <0x3c>;
nvidia,aspm-pwr-on-t-us = <0x14>;
nvidia,aspm-l0s-entrance-latency-us = <0x03>;
bus-range = <0x00 0xff>;
ranges = <0x81000000 0x00 0x3e100000 0x00 0x3e100000 0x00 0x100000 0x82000000 0x00 0x40000000 0x32 0x28000000 0x00 0x8000000 0xc3000000 0x2e 0x40000000 0x2e 0x40000000 0x03 0xe8000000>;
nvidia,cfg-link-cap-l1sub = <0x1c4>;
nvidia,cap-pl16g-status = <0x174>;
nvidia,cap-pl16g-cap-off = <0x188>;
nvidia,event-cntr-ctrl = <0x1d8>;
nvidia,event-cntr-data = <0x1dc>;
nvidia,dl-feature-cap = <0x30c>;
nvidia,ptm-cap-off = <0x318>;
vddio-pex-ctl-supply = <0x4f>;
phys = <0x56>;
phy-names = "p2u-0";
nvidia,disable-power-down;
nvidia,disable-clock-request;
phandle = <0x302>;
};

This is my current PCIe device tree node. The interface is connected to an AQC113CS (x2, Gen4). Is there a problem with the device tree?
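
For reference, the two properties in the node above that appear to describe the x2 Gen4 setup are num-lanes = <0x02> and nvidia,max-speed = <0x04>. Below is a minimal Python sketch for pulling them out of a decompiled device tree. It assumes that nvidia,max-speed encodes the PCIe generation; the helper name read_cell and the shortened DTS_SNIPPET are made up for illustration.

```python
import re
from typing import Optional

# Shortened copy of the node above; only the two properties of interest are kept.
DTS_SNIPPET = """
pcie@141e0000 {
    num-lanes = <0x02>;
    nvidia,max-speed = <0x04>;
};
"""

def read_cell(dts: str, prop: str) -> Optional[int]:
    """Return the first cell of `prop = <...>;` as an int, or None if absent.

    Hypothetical helper: it only handles simple single-line cell properties.
    """
    pattern = rf"^\s*{re.escape(prop)}\s*=\s*<\s*(0x[0-9a-fA-F]+|\d+)"
    match = re.search(pattern, dts, re.MULTILINE)
    return int(match.group(1), 0) if match else None

lanes = read_cell(DTS_SNIPPET, "num-lanes")
# Assumption: the value is the PCIe generation (4 would mean Gen4, 16GT/s).
generation = read_cell(DTS_SNIPPET, "nvidia,max-speed")
print(f"configured: x{lanes}, Gen{generation}")
```

If both values match what the board is wired for, the remaining suspects are the link itself and the device.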

PCIe looks fine; the problem seems to be with the device. The messages indicate that the MAC firmware failed to boot. If you have the Marvell dev tools (available under NDA from Marvell), you can debug the reason.
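
To make that reading of the log concrete, here is a small Python sketch that condenses the atlantic lines from the dmesg output above into a summary. The function name summarize_atlantic and the dictionary keys are made up for illustration, and the errno table is limited to the one value seen in this log (110 is ETIMEDOUT on Linux).

```python
import re

# Linux errno value that appears in the log above; 110 is ETIMEDOUT on Linux.
LINUX_ERRNO = {110: "ETIMEDOUT"}

# Lines copied from the dmesg output earlier in this thread.
DMESG = """\
[16202.455585] atlantic: Loaded MAC FW Version: 90.217.65535
[16202.455593] atlantic: Waiting for MAC FW to complete boot and start PHY FW
[16212.464236] atlantic: Error: Timeout waiting for MAC FW to finish boot
[16212.495217] atlantic: HW prepare failed, err = -110
[16212.502949] atlantic: probe of 0001:01:00.0 failed with error -110
"""

def summarize_atlantic(log: str) -> dict:
    """Collect the firmware version, boot timeout and probe error from a log."""
    summary = {"fw_version": None, "fw_boot_timeout": False, "probe_error": None}
    for line in log.splitlines():
        if "atlantic" not in line:
            continue
        version = re.search(r"Loaded MAC FW Version: (\S+)", line)
        if version:
            summary["fw_version"] = version.group(1)
        if "Timeout waiting for MAC FW to finish boot" in line:
            summary["fw_boot_timeout"] = True
        probe = re.search(r"probe of \S+ failed with error -(\d+)", line)
        if probe:
            code = int(probe.group(1))
            # Fall back to the raw number for codes not in the small table.
            summary["probe_error"] = LINUX_ERRNO.get(code, str(code))
    return summary

print(summarize_atlantic(DMESG))
```

The summary only restates what the driver logged: the firmware was loaded, it never reported a finished boot, and the probe gave up with a timeout.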

We’re using an AQC113CS on an Orin NX 16GB / Orin Nano dev kit, connected to C1, with R36.3 and the in-box kernel atlantic driver, and it all works fine. I think you may be using the atlantic driver from the Marvell website.

Are you using an off-the-shelf PCIe adapter? Have you tested that it works on another system?

Thank you, I have resolved the issue. It was a hardware soldering problem.
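
For anyone hitting the same symptom: the lspci -vvv output above reports LnkCap Width x2 but LnkSta Width x1 (downgraded). A narrower-than-supported link can have several causes, so treat it as a hint to inspect the hardware rather than as proof. A minimal Python sketch for flagging it, with made-up helper names link_field and is_degraded:

```python
import re

# The two relevant lines from the lspci -vvv output earlier in this thread.
LSPCI = """\
LnkCap: Port #0, Speed 16GT/s, Width x2, ASPM L0s L1, Exit Latency L0s <4us, L1 <64us
LnkSta: Speed 16GT/s (ok), Width x1 (downgraded)
"""

def link_field(text: str, field: str):
    """Return (speed in GT/s, lane count) for 'LnkCap' or 'LnkSta', or None."""
    match = re.search(rf"{field}:.*?Speed ([\d.]+)GT/s.*?Width x(\d+)", text)
    return (float(match.group(1)), int(match.group(2))) if match else None

def is_degraded(text: str) -> bool:
    """True if the negotiated link is slower or narrower than the capability."""
    cap = link_field(text, "LnkCap")
    sta = link_field(text, "LnkSta")
    if cap is None or sta is None:
        raise ValueError("LnkCap or LnkSta line not found")
    return sta[0] < cap[0] or sta[1] < cap[1]

cap = link_field(LSPCI, "LnkCap")
sta = link_field(LSPCI, "LnkSta")
print(f"capable: x{cap[1]} @ {cap[0]}GT/s, negotiated: x{sta[1]} @ {sta[0]}GT/s")
print("degraded link" if is_degraded(LSPCI) else "link at full capability")
```

On a live system the same text can be obtained from lspci -vvv for the device in question.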
