Hello,
- L4T 35.4.1
- devicetree tegra234-p3767-0000-p3768-0000-a0.dtb
- tegra234-mb2-bct-scr-p3767-0000.dts patched (Jetson_Linux_Release_Notes_r35.4.1.pdf 4.2.3) and tegra234-mb2-bct-misc-p3767-0000.dts patched (to disable EEPROM)
- default pinmux
- odm gbe-uphy-config-8,hsstp-lane-map-3,hsio-uphy-config-0
- kernel 5.10 patched (from meta-tegra mickledore)
We would like to know whether it is possible, based on the p3767-0000-p3768-0000 carrier board design, to add an M.2 port on pcie2 in order to build a custom carrier board that supports the Orin NX 16G.
We have built a prototype, and we are seeing issues with the pcieport 0001:00:00.0 @14100000.
Kernel boot log:
[…]
[ 4.108650] tegra194-pcie 14100000.pcie: Adding to iommu group 7
[ 4.121163] tegra194-pcie 14100000.pcie: Using GICv2m MSI allocator
[ 4.122172] tegra194-pcie 14160000.pcie: Adding to iommu group 8
[ 4.328510] tegra194-pcie 14160000.pcie: Using GICv2m MSI allocator
[ 4.133510] tegra194-pcie 141e0000.pcie: Adding to iommu group 9
[ 4.144811] tegra194-pcie 141e0000.pcie: Using GICv2m MSI allocator
[ 4.145341] tegra194-pcie 140a0000.pcie: Adding to iommu group 10
[ 4.157356] tegra194-pcie 140a0000.pcie: Using GICv2m MSI allocator
[…]
[ 5.187627] tegra194-pcie 14100000.pcie: Using GICv2m MSI allocator
[ 5.194287] tegra194-pcie 14100000.pcie: host bridge /pcie@14100000 ranges:
[ 5.199710] tegra194-pcie 14100000.pcie: IO 0x0030100000..0x00301fffff -> 0x0030100000
[ 5.208193] tegra194-pcie 14100000.pcie: MEM 0x20a8000000..0x20afffffff -> 0x0040000000
[ 5.216769] tegra194-pcie 14100000.pcie: MEM 0x2080000000..0x20a7ffffff -> 0x2080000000
[ 5.333570] tegra194-pcie 14100000.pcie: Link up
[ 5.334791] tegra194-pcie 14100000.pcie: PCI host bridge to bus 0001:00
[ 5.334974] pci_bus 0001:00: root bus resource [bus 00-ff]
[ 5.335118] pci_bus 0001:00: root bus resource [io 0x0000-0xfffff] (bus address [0x30100000-0x301fffff])
[ 5.335365] pci_bus 0001:00: root bus resource [mem 0x20a8000000-0x20afffffff] (bus address [0x40000000-0x47ffffff])
[ 5.335645] pci_bus 0001:00: root bus resource [mem 0x2080000000-0x20a7ffffff pref]
[ 5.335907] pci 0001:00:00.0: [10de:229e] type 01 class 0x060400
[ 5.336241] pci 0001:00:00.0: PME# supported from D0 D3hot
[ 5.340418] pci 0001:01:00.0: [1055:7430] type 00 class 0x020000
[ 5.340940] pci 0001:01:00.0: reg 0x10: [mem 0x00000000-0x00001fff 64bit]
[ 5.341337] pci 0001:01:00.0: reg 0x18: [mem 0x00000000-0x000000ff 64bit]
[ 5.341722] pci 0001:01:00.0: reg 0x20: [mem 0x00000000-0x000000ff 64bit]
[ 5.344380] pci 0001:01:00.0: PME# supported from D0 D3hot
[ 5.348473] pci 0001:00:00.0: BAR 14: assigned [mem 0x20a8000000-0x20a80fffff]
[ 5.348660] pci 0001:01:00.0: BAR 0: assigned [mem 0x20a8000000-0x20a8001fff 64bit]
[ 5.349031] pci 0001:01:00.0: BAR 2: assigned [mem 0x20a8002000-0x20a80020ff 64bit]
[ 5.353031] pci 0001:01:00.0: BAR 4: assigned [mem 0x20a8002100-0x20a80021ff 64bit]
[ 5.360712] pci 0001:00:00.0: PCI bridge to [bus 01-ff]
[ 5.365788] pci 0001:00:00.0: bridge window [mem 0x20a8000000-0x20a80fffff]
[ 5.373149] pci 0001:00:00.0: Max Payload Size set to 256/ 256 (was 256), Max Read Rq 512
[ 5.381973] pci 0001:01:00.0: Max Payload Size set to 256/ 512 (was 128), Max Read Rq 512
[ 5.390699] pcieport 0001:00:00.0: Adding to iommu group 7
[ 5.396181] pcieport 0001:00:00.0: PME: Signaling with IRQ 55
[ 5.402460] pcieport 0001:00:00.0: AER: enabled with IRQ 55
[ 5.402475] pcieport 0001:00:00.0: AER: Multiple Corrected error received: 0001:00:00.0
[ 5.415418] pcieport 0001:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[ 5.424947] pcieport 0001:00:00.0: device [10de:229e] error status/mask=00000081/0000e000
[ 5.433167] pcieport 0001:00:00.0: [ 0] RxErr
[ 5.439292] pcieport 0001:00:00.0: [ 7] BadDLLP
[ 5.445615] pcieport 0001:00:00.0: AER: Multiple Corrected error received: 0001:00:00.0
[ 5.453560] pcieport 0001:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[ 5.463094] pcieport 0001:00:00.0: device [10de:229e] error status/mask=00000081/0000e000
[ 5.471516] pcieport 0001:00:00.0: [ 0] RxErr
[ 5.477617] pcieport 0001:00:00.0: [ 7] BadDLLP
[…] (these messages repeat indefinitely)
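For reference, the `error status/mask=00000081/0000e000` value above can be decoded by hand. The following is only a minimal sketch: it assumes the bit positions of the AER Correctable Error Status register from the PCIe specification, and the short labels are chosen to resemble the ones the kernel prints. The helper name `decode_bits` is ours, not a kernel or library API.

```python
# Bit positions of the AER Correctable Error Status register (assumed from
# the PCIe specification); labels are short names similar to the kernel's.
AER_CORRECTABLE_BITS = {
    0: "RxErr",         # Receiver Error
    6: "BadTLP",        # Bad TLP
    7: "BadDLLP",       # Bad DLLP
    8: "Rollover",      # REPLAY_NUM Rollover
    12: "Timeout",      # Replay Timer Timeout
    13: "NonFatalErr",  # Advisory Non-Fatal Error
    14: "CorrIntErr",   # Corrected Internal Error
    15: "HeaderOF",     # Header Log Overflow
}


def decode_bits(value, names):
    """Return the names of all bits set in value (hypothetical helper)."""
    return [names.get(bit, f"bit{bit}") for bit in range(32) if (value >> bit) & 1]


# Values copied from the log line "error status/mask=00000081/0000e000".
status, mask = 0x00000081, 0x0000E000

reported = decode_bits(status & ~mask, AER_CORRECTABLE_BITS)
masked = decode_bits(mask, AER_CORRECTABLE_BITS)

print("reported:", reported)
print("masked:  ", masked)
```

Both reported bits (RxErr and BadDLLP) are physical and data link layer errors, which is why we suspect the signal path on our prototype rather than the software configuration.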
The other PCIe controllers seem to have a correct boot log.
Sometimes the boot reaches a shell session after this kernel warning:
[ 14.263474] ------------[ cut here ]------------
[ 14.263637] WARNING: CPU: 2 PID: 114 at drivers/net/phy/phy.c:963 phy_error+0x1c/0x64
[ 14.263640] Modules linked in: […] brcmutil(O) cfg80211 […]
tegra_bpmp_thermal snd_hda_tegra snd_hda_codec snd_hda_core spi_tegra114 lan743x r8168 pwm_fan nvidia_drm(O) nvidia_modeset(O) nvidia(O) nvgpu nvmap ina3221
[ 14.264595] CPU: 2 PID: 114 Comm: kworker/u16:1 Tainted: G O 5.10.120-l4t-r35.4.ga+g76678311c10b #1
[ 14.264881] Hardware name: Unknown NVIDIA Orin NX Developer Kit/NVIDIA Orin NX Developer Kit, BIOS v35.4.1 10/16/2023
[ 14.265183] Workqueue: events_power_efficient phy_state_machine
[ 14.265340] pstate: 60c00009 (nZCv daif +PAN +UAO -TCO BTYPE=--)
[ 14.265502] pc : phy_error+0x1c/0x64
[ 14.265843] lr : phy_state_machine+0xa8/0x264
[ 14.266502] sp : ffff8000116f3d20
[ 14.267010] x29: ffff8000116f3d20 x28: ffff1bab81c89600
[ 14.267827] x27: ffff1bab80142470 x26: 00000000fffffef7
[ 14.268646] x25: 0000000000000000 x24: ffff1bab884d54a8
[ 14.270226] x23: 00000000ffffff92 x22: ffff1bab884d54a0
[ 14.275739] x21: ffff1bab884d54f8 x20: 0000000000000003
[ 14.281252] x19: ffff1bab884d5000 x18: 0000000000000000
[ 14.286677] x17: 0000000000000000 x16: ffffd898c8e2be30
[ 14.292189] x15: 0000000000000000 x14: 0000000000000000
[ 14.297613] x13: 0000000000000000 x12: 0000000000000000
[ 14.303038] x11: 0000000000000000 x10: bf13ea9f804ed36e
[ 14.308552] x9 : ffffd898c9772338 x8 : ffffd898ca04a208
[ 14.313975] x7 : ffffd898ca04a240 x6 : 000000003d0de856
[ 14.319401] x5 : 00ffffffffffffff x4 : ffff1bab81c96900
[ 14.324827] x3 : ffff1bab884d54f8 x2 : 0000000000000000
[ 14.330164] x1 : ffff1bab81c96900 x0 : ffff1bab884d5000
[ 14.335503] Call trace:
[ 14.337954] phy_error+0x1c/0x64
[ 14.341366] phy_state_machine+0xa8/0x264
[ 14.345396] process_one_work+0x1fc/0x4bc
[ 14.349416] worker_thread+0x7c/0x460
[ 14.352915] kthread+0x160/0x16c
[ 14.356331] ret_from_fork+0x10/0x38
[ 14.359827] ---[ end trace 76f27c4ebfcece07 ]---
After that, the PCIe errors become:
[ 15.437998] pcieport 0001:00:00.0: AER: Root Port link has been reset
[ 15.540596] pcieport 0001:00:00.0: AER: device recovery successful
[ 15.540784] pcieport 0001:00:00.0: AER: Multiple Uncorrected (Non-Fatal) error received: 0001:00:00.0
[ 15.541072] pcieport 0001:00:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver ID)
[ 15.541398] pcieport 0001:00:00.0: device [10de:229e] error status/mask=00000020/00400000
[ 15.541645] pcieport 0001:00:00.0: [ 5] SDES (First)
[ 15.541839] lan743x 0001:01:00.0: AER: can't recover (no error_detected callback)
[ 15.542051] pcieport 0001:00:00.0: AER: device recovery failed
[ 15.542205] pcieport 0001:00:00.0: AER: Multiple Uncorrected (Non-Fatal) error received: 0001:00:00.0
[ 15.542465] pcieport 0001:00:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver ID)
[ 15.543767] pcieport 0001:00:00.0: device [10de:229e] error status/mask=00000020/00400000
[ 15.545039] pcieport 0001:00:00.0: [ 5] SDES (First)
[ 15.546026] lan743x 0001:01:00.0: AER: can't recover (no error_detected callback)
[ 15.552307] pcieport 0001:00:00.0: AER: device recovery failed
[…] (these messages repeat indefinitely)
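The `error status/mask=00000020/00400000` value in this second loop can be decoded the same way. Again this is only a sketch, assuming the bit layout of the AER Uncorrectable Error Status register from the PCIe specification, with labels matching the `UESta` line that lspci prints; `decode_bits` is a hypothetical helper of ours.

```python
# Bit positions of the AER Uncorrectable Error Status register (assumed from
# the PCIe specification); labels follow the lspci "UESta" abbreviations.
AER_UNCORRECTABLE_BITS = {
    4: "DLP",            # Data Link Protocol Error
    5: "SDES",           # Surprise Down Error
    12: "TLP",           # Poisoned TLP
    13: "FCP",           # Flow Control Protocol Error
    14: "CmpltTO",       # Completion Timeout
    15: "CmpltAbrt",     # Completer Abort
    16: "UnxCmplt",      # Unexpected Completion
    17: "RxOF",          # Receiver Overflow
    18: "MalfTLP",       # Malformed TLP
    19: "ECRC",          # ECRC Error
    20: "UnsupReq",      # Unsupported Request
    21: "ACSViol",       # ACS Violation
    22: "UncorrIntErr",  # Uncorrectable Internal Error
}


def decode_bits(value, names):
    """Return the names of all bits set in value (hypothetical helper)."""
    return [names.get(bit, f"bit{bit}") for bit in range(32) if (value >> bit) & 1]


# Values copied from the log line "error status/mask=00000020/00400000".
status, mask = 0x00000020, 0x00400000

reported = decode_bits(status & ~mask, AER_UNCORRECTABLE_BITS)
masked = decode_bits(mask, AER_UNCORRECTABLE_BITS)

print("reported:", reported)
print("masked:  ", masked)
```

The only reported bit is SDES (Surprise Down Error), meaning the root port saw the link go down unexpectedly.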
When we do reach the shell, we can collect the following information:
sudo lspci -vvv
0001:00:00.0 PCI bridge: NVIDIA Corporation Device 229e (rev a1) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR+ <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 55
IOMMU group: 7
Bus: primary=00, secondary=01, subordinate=ff, sec-latency=0
I/O behind bridge: f000-0fff [disabled] [16-bit]
Memory behind bridge: a8000000-a80fffff [size=1M] [32-bit]
Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff [disabled] [64-bit]
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
BridgeCtl: Parity- SERR+ NoISA- VGA- VGA16- MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
Address: 0000000000000000 Data: 0000
Capabilities: [70] Express (v2) Root Port (Slot-), MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0
ExtTag- RBE+
DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
MaxPayload 256 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- NonFatalErr+ FatalErr- UnsupReq- AuxPwr+ TransPend-
LnkCap: Port #0, Speed 16GT/s, Width x1, ASPM not supported
ClockPM- Surprise+ LLActRep+ BwNot+ ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt+ AutBWInt-
LnkSta: Speed 2.5GT/s, Width x1
TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
RootCap: CRSVisible+
RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna+ CRSVisible+
RootSta: PME ReqID 0000, PMEStatus- PMEPending-
DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP+ LTR+
10BitTagComp+ 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
FRS- LN System CLS Not Supported, TPHComp- ExtTPHComp- ARIFwd+
AtomicOpsCap: Routing- 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- 10BitTagReq- OBFF Disabled, ARIFwd-
AtomicOpsCtl: ReqEn- EgressBlck-
LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance Preset/De-emphasis: -6dB de-emphasis, 0dB preshoot
LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete- EqualizationPhase1-
EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
Retimer- 2Retimers- CrosslinkRes: Downstream Port
Capabilities: [b0] MSI-X: Enable- Count=1 Masked-
Vector table: BAR=0 offset=00000000
PBA: BAR=0 offset=00000000
Capabilities: [100 v2] Advanced Error Reporting
UESta: DLP- SDES+ TLP- FCP- CmpltTO+ CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
AERCap: First Error Pointer: 05, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
MultHdrRecCap+ MultHdrRecEn- TLPPfxPres- HdrLogCap-
HeaderLog: 00000000 00000000 00000000 00000000
RootCmd: CERptEn+ NFERptEn+ FERptEn+
RootSta: CERcvd- MultCERcvd- UERcvd- MultUERcvd-
FirstFatal- NonFatalMsg- FatalMsg- IntMsg 0
ErrorSrc: ERR_COR: 0000 ERR_FATAL/NONFATAL: 0000
Capabilities: [148 v1] Secondary PCI Express
LnkCtl3: LnkEquIntrruptEn- PerformEqu-
LaneErrStat: 0
Capabilities: [158 v1] Physical Layer 16.0 GT/s <?>
Capabilities: [17c v1] Lane Margining at the Receiver <?>
Capabilities: [190 v1] L1 PM Substates
L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2- ASPM_L1.1- L1_PM_Substates+
PortCommonModeRestoreTime=60us PortTPowerOnTime=40us
L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
T_CommonMode=10us
L1SubCtl2: T_PwrOn=10us
Capabilities: [1a0 v1] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
Capabilities: [2a0 v1] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
Capabilities: [2d8 v1] Data Link Feature <?>
Capabilities: [2e4 v1] Precision Time Measurement
PTMCap: Requester:- Responder:+ Root:+
PTMClockGranularity: 16ns
PTMControl: Enabled:- RootSelected:-
PTMEffectiveGranularity: Unknown
Capabilities: [2f0 v1] Vendor Specific Information: ID=0004 Rev=1 Len=054 <?>
Capabilities: [358 v1] Vendor Specific Information: ID=0006 Rev=0 Len=018 <?>
Kernel driver in use: pcieport
0001:01:00.0 Ethernet controller: Microchip Technology / SMSC LAN7430 (rev 11)
Subsystem: Microchip Technology / SMSC LAN7430
!!! Unknown header type 7f
Interrupt: pin ? routed to IRQ 55
IOMMU group: 7
Region 0: Memory at 20a8000000 (64-bit, non-prefetchable) [size=8K]
Region 2: Memory at 20a8002000 (64-bit, non-prefetchable) [size=256]
Region 4: Memory at 20a8002100 (64-bit, non-prefetchable) [size=256]
Kernel driver in use: lan743x
Kernel modules: lan743x
The information reported for the 0001:01:00.0 Ethernet controller seems incorrect. The other PCIe devices seem to report correct data.
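Regarding the `!!! Unknown header type 7f` line: as far as we understand, lspci masks the header type byte (config space offset 0x0E) with 0x7f, because bit 7 is the multi-function flag. If every config read returns all ones, the result is 0x7f, which is not a valid header layout. The sketch below illustrates this with a simulated all-ones config space; the function name `header_type` is hypothetical, and the sysfs path in the comment was not used to produce this example.

```python
PCI_HEADER_TYPE_OFFSET = 0x0E  # standard offset of the Header Type byte
VALID_LAYOUTS = {0x00: "endpoint", 0x01: "PCI-to-PCI bridge", 0x02: "CardBus bridge"}


def header_type(config):
    """Split the Header Type byte into (layout, multi_function) (hypothetical helper)."""
    raw = config[PCI_HEADER_TYPE_OFFSET]
    return raw & 0x7F, bool(raw & 0x80)


# Simulated config space of a device that no longer answers: every byte reads 0xff.
# On the target, the real bytes could be read from
# /sys/bus/pci/devices/0001:01:00.0/config instead (assumption, not run here).
all_ones = bytes([0xFF]) * 64

layout, multi_function = header_type(all_ones)
print(f"header type {layout:02x}:", VALID_LAYOUTS.get(layout, "unknown"))
```

This would be consistent with the endpoint no longer answering config reads once the link has dropped (`DLActive-` and `Speed 2.5GT/s` in `LnkSta` above), although it does not prove it.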
Could you help us solve this issue? Thank you in advance.