pcie ethernet i210 flash failed

Hi,
On our carrier board, an I210 is connected on UPHY1. I need to flash its external SPI flash, but when I run the flash command, an SError occurs.
Please help me identify the cause.

The failure log is shown below:

op:~/yaccor$ sudo …/flashrom -VVVV -p nicintel_spi:pci=0001:01:00.0 -r temp.txt
flashrom on Linux 4.9.140-tegra (aarch64)
flashrom is free software, get the source code at https://flashrom.org

flashrom was built with libpci 3.6.2, GCC 7.4.0, little endian
Command line (5 args): …/flashrom -VVVV -p nicintel_spi:pci=0001:01:00.0 -r tet
Using clock_gettime for delay loops (clk_id: 1, resolution: 1ns).
Initializing nicintel_spi programmer
Found “Intel I210 Gigabit Network Connection Unprogrammed” (8086:1531, BDF 01:0.
PCI header type 0x00
Requested BAR is of type MEMMEM BAR access requested, but device has MEM space .
, 32bit, not prefetchable
PCI header type 0x00
Requested BAR is of type MEMMEM BAR access requested, but device has MEM space .
, 32bit, not prefetchable
page_size=1000
pre-rounding: start=0x0000000040012000, len=0x1000, end=0x0000000040013000
post-rounding: start=0x0000000040012000, len=0x1000, end=0x0000000040013000
[ 57.897433] CPU0: SError detected, daif=140, spsr=0x80000000, mpidr=800000000
[ 57.897459] CPU3: SError detected, daif=1c0, spsr=0x80c000c5, mpidr=800001010
[ 57.897477] CPU1: SError detected, daif=1c0, spsr=0x80c000c5, mpidr=800000010
[ 57.897490] CPU2: SError detected, daif=1c0, spsr=0x80c000c5, mpidr=800001000
[ 57.897719] ras_ccplex_serr_callback: Scanning CCPLEX Error Records for Uncos
[ 57.897775] **************************************
[ 57.897784] RAS Error in SCF:SNOC, ERRSELR_EL1=1026:
[ 57.897788] Status = 0xf400a20d
[ 57.897795] IERR = Uncorrectable Carveout Error: 0xa2
[ 57.897802] SERR = Illegal address (software fault): 0xd
[ 57.897806] Uncorrectable (this is fatal)
[ 57.897824] MISC0 = 0x804
[ 57.897828] MISC1 = 0x2992800000800
[ 57.897885] ADDR = 0x800000004001201c
[ 57.897985] **************************************
[ 57.898013] ras_corecluster_serr_callback:Scanning CoreCluster Error Recordss
[ 57.898035] **************************************
[ 57.898039] RAS Error in L2, ERRSELR_EL1=512:
[ 57.898043] Status = 0xf400640d
[ 57.898048] IERR = SCF to L2 Decode Error Read: 0x64
[ 57.898052] SERR = Illegal address (software fault): 0xd
[ 57.898055] Uncorrectable (this is fatal)
[ 57.898076] MISC0 = 0x80000000100000
[ 57.898079] MISC1 = 0x20240000002
[ 57.898128] ADDR = 0x800000004001201c
[ 57.898171] **************************************
[ 57.898239] ras_core_serr_callback: Scanning Core Error Records for Uncorrecs
[ 57.898383] Bad mode in Error handler detected on CPU3, code 0xbe000000 -- Sr
[ 57.898393] Internal error: Oops - bad mode: 0 [#1] PREEMPT SMP
[ 57.898456] Modules linked in: bnep fuse overlay zram nvgpu bluedroid_pm ip_s
[ 57.898514] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.9.140-tegra #1
[ 57.898519] Hardware name: Jetson-AGX (DT)
[ 57.898528] task: ffffffc3ee3c4600 task.stack: ffffffc3ee3d8000
[ 57.898554] PC is at t19x_cpu_enter_state+0x4c/0x118
[ 57.898559] LR is at t19x_cpu_enter_state+0x1c/0x118
[ 57.898564] pc : [] lr : [] pstate: 80c05
[ 57.898568] sp : ffffffc3ee3dbe80
[ 57.898578] x29: ffffffc3ee3dbe80 x28: 0000000000000001
[ 57.898588] x27: ffffff8009e56000 x26: ffffff8009821a48
[ 57.898598] x25: 0000000000000000 x24: 0000000d7acbb480
[ 57.898607] x23: ffffff8009fca820 x22: ffffffc3ffdeea50
[ 57.898617] x21: ffffff8009fca838 x20: ffffff800a195870
[ 57.898626] x19: 0000000000000000 x18: 0000000000000000
[ 57.898635] x17: 0000007fb66b4748 x16: ffffff80082b2758
[ 57.898645] x15: 0000000000000000 x14: 0000000000322c65
[ 57.898656] x13: 000000000000c1cb x12: 071c71c71c71c71c
[ 57.898666] x11: 000000000000000b x10: 0101010101010101
[ 57.898677] x9 : fffffffffffffffe x8 : 7f7f7f7f7f7f7f7f
[ 57.898686] x7 : fefefeff646c606d x6 : 00170401e9e1acf4
[ 57.898696] x5 : 742c616901041700 x4 : 8080808000000000
[ 57.898706] x3 : b34b234b0963a000 x2 : 000000000000000b
[ 57.898715] x1 : 0000000000000000 x0 : 0000000000000000
[ 57.898719]
[ 57.898745] Process swapper/3 (pid: 0, stack limit = 0xffffffc3ee3d8000)
[ 57.898750] Call trace:
[ 57.898761] [] t19x_cpu_enter_state+0x4c/0x118
[ 57.898773] [] cpuidle_enter_state+0x84/0x380
[ 57.898780] [] cpuidle_enter+0x34/0x48
[ 57.898792] ras_ccplex_serr_callback: Scanning CCPLEX Error Records for Uncos
[ 57.898805] [] call_cpuidle+0x44/0x70
[ 57.898812] [] cpu_startup_entry+0x1b0/0x200
[ 57.898826] [] secondary_start_kernel+0x190/0x1f8
[ 57.898831] [<0000000080f4f1a4>] 0x80f4f1a4
[ 57.898858] ras_corecluster_serr_callback:Scanning CoreCluster Error Recordss
[ 57.898878] ---[ end trace 4d0b70969ae1c7ae ]---
[ 57.901366] ras_core_serr_callback: Scanning Core Error Records for Uncorrecs
[ 57.901650] ras_ccplex_serr_callback: Scanning CCPLEX Error Records for Uncos
[ 57.901699] ras_corecluster_serr_callback:Scanning CoreCluster Error Recordss
[ 57.901760] ras_core_serr_callback: Scanning Core Error Records for Uncorrecs
[ 57.911999] Kernel panic - not syncing: Attempted to kill the idle task!
[ 57.912018] SMP: stopping secondary CPUs
[ 57.912105] Kernel Offset: disabled
[ 57.912111] Memory Limit: none
[ 58.291475] trusty-log panic notifier - trusty version Built: 21:17:12 Aug 1
[ 58.291478] Rebooting in 5 seconds..

Thanks
BR

Hi,

Why do you mention “UPHY1” here? Do you mean the UPHY_TX/RX1 lane of the AGX Xavier?

Hi Wayne,

Thanks for your reply.
Yes.

Thanks

Hi jiangch0126,

But UPHY1 on Xavier is designed for USB.

Hi Wayne,

Sure.
On our carrier board, UPHY1 is used for the PCIe Ethernet controller, an I210.
We don’t need USB 3.0.

The error occurs when I try to flash the external SPI flash for the I210.

Thanks

Hi,

That will not work. On Xavier, we don’t support changing the UPHY functionality from USB to PCIe.

Hi Wayne,

Sorry for the mistake. It is UPHY0 that is used for the PCIe Ethernet controller (I210).
Please help me find out why the SError occurs.

Thanks

This issue looks similar to yours.

https://devtalk.nvidia.com/default/topic/1062004

Which pinmux file are you using?

Hi Wayne,

Thanks for your reply.
I’m using the legacy R32.2.1 load for JAX with our carrier board.
I think the pinmux cfg should be “tegra19x-mb1-pinmux-p2888-0000-a04-p2822-0000-b01.cfg”.

Thanks
BR

Did you change any pinmux settings using the pinmux spreadsheet?

Hi Wayne,

Thanks for your reply.
No.
The load is the legacy R32.2.1.
As described above, the only difference is that we use PCIe Ethernet rather than the eSATA bridge on UPHY0.
The device is detected by lspci.
To make it work, I want to flash the external SPI flash for the I210.
That is when the error and the reboot occurred.

Should I update anything according to the latest pinmux spreadsheet?

Thanks

We are still checking it internally. Thanks.

Hi WayneWWW

We face exactly the same problem with an I210 device connected to UPHY8. The “flashrom” tool crashes as described in post #1 by jiangch0126. The same happens with Intel’s programming tool “Eeprom Access Tool V0.7.8”. Do you have any news on this topic?
Thank you.

Kind regards

Hi WayneWWW

This seems to be a problem with the Jetson Xavier module only. We tested both tools (flashrom and eepromaccesstool) on a TX2 system with the same Linux4Tegra R32.2.1 and saw no errors. The pinmux does not seem to be the cause, as we also tested this on the Jetson Xavier module with the standard devkit pinmux.

Kind regards

Thanks for the update. We are still investigating this issue.

Hi Wayne,
Thanks a lot for your investigation.
Here is some information that might be helpful.
There are 6 BARs. I found that the default BAR used by flashrom is BAR0.
When BAR0 is used, the virtual address seems to map to the bus bridge address (0x40000000).
If I change it to the NIC’s address (0x1200000000), no crash occurs.

@sevm89, if fixing this is urgent for your project, you can program the external flash first and then mount it on the PCB.

Thanks

Can you please share your ‘sudo lspci -vvvv’ output?
There is a translation between the system address (a.k.a. physical address), where the CPU accesses an endpoint’s BAR, and the bus address (the one sent out on the PCIe bus).
In this case, 0x12_0000_0000 is the system address and 0x4000_0000 is the bus address.
The ‘lspci’ tool in fact reports the BAR0 address as 0x12_0000_0000 and NOT as 0x4000_0000.
I’m not sure how the flashrom tool is getting the 0x4000_0000 address. As you have already found, the correct address to use here is 0x12_0000_0000.
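
The translation described above can be pictured as a fixed offset inside one memory window. The following is only a minimal sketch to illustrate the arithmetic, not the kernel's or flashrom's actual code. The two base addresses and the logged addresses come from this thread; the function names, the window size, and the masking of the RAS ADDR field to its low 32 bits are assumptions made for the example.

```python
# Sketch only: one PCIe memory window mapping bus addresses to system addresses.
SYS_BASE = 0x12_0000_0000  # system (physical) address, quoted in this thread
BUS_BASE = 0x4000_0000     # bus address, quoted in this thread
WINDOW_SIZE = 0x10_0000    # ASSUMPTION: window size picked for the example


def bus_to_system(bus_addr: int) -> int:
    """Translate a PCIe bus address to the system address the CPU must use."""
    offset = bus_addr - BUS_BASE
    if not 0 <= offset < WINDOW_SIZE:
        raise ValueError(f"bus address {bus_addr:#x} is outside the window")
    return SYS_BASE + offset


def system_to_bus(sys_addr: int) -> int:
    """Translate a system address to the address seen on the PCIe bus."""
    offset = sys_addr - SYS_BASE
    if not 0 <= offset < WINDOW_SIZE:
        raise ValueError(f"system address {sys_addr:#x} is outside the window")
    return BUS_BASE + offset


# flashrom's log shows "start=0x0000000040012000", which is a bus address.
flashrom_start = 0x4001_2000
print(hex(bus_to_system(flashrom_start)))  # 0x1200012000

# The RAS records report ADDR = 0x800000004001201c. Keeping only the low
# 32 bits is an ASSUMPTION about that field; under that assumption the
# faulting access lies inside the 4 KiB page that flashrom mapped.
fault_low = 0x800000004001201C & 0xFFFF_FFFF
print(flashrom_start <= fault_low < flashrom_start + 0x1000)  # True
```

Under these assumptions, the page flashrom tried to map at 0x40012000 corresponds to system address 0x1200012000, which matches the observation that using 0x1200000000 as the base avoids the crash.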

Hi vidyas

Here is our output of ‘lspci -vvvv’:

0004:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad1 (rev a1) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 39
	Bus: primary=00, secondary=01, subordinate=ff, sec-latency=0
	I/O behind bridge: 00000000-00000fff
	Memory behind bridge: 40000000-40bfffff
	Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
	BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [40] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
		Address: 0000000000000000  Data: 0000
		Masking: 00000000  Pending: 00000000
	Capabilities: [70] Express (v2) Root Port (Slot-), MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0
			ExtTag- RBE+
		DevCtl:	Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 256 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap:	Port #0, Speed 16GT/s, Width x4, ASPM not supported, Exit Latency L0s <1us, L1 <64us
			ClockPM- Surprise+ LLActRep+ BwNot+ ASPMOptComp+
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk-
			ExtSynch- ClockPM- AutWidDis- BWInt+ AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna+ CRSVisible+
		RootCap: CRSVisible+
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
		DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+, OBFF Not Supported ARIFwd-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled ARIFwd-
		LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [b0] MSI-X: Enable- Count=8 Masked-
		Vector table: BAR=2 offset=00000000
		PBA: BAR=2 offset=00010000
	Capabilities: [100 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
	Capabilities: [148 v1] #19
	Capabilities: [168 v1] #26
	Capabilities: [18c v1] #27
	Capabilities: [1ac v1] L1 PM Substates
		L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2- ASPM_L1.1- L1_PM_Substates+
			  PortCommonModeRestoreTime=60us PortTPowerOnTime=40us
		L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
			   T_CommonMode=10us
		L1SubCtl2: T_PwrOn=10us
	Capabilities: [1bc v1] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
	Capabilities: [2bc v1] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
	Capabilities: [2f4 v1] #25
	Capabilities: [300 v1] Precision Time Measurement
		PTMCap: Requester:+ Responder:+ Root:+
		PTMClockGranularity: 16ns
		PTMControl: Enabled:- RootSelected:-
		PTMEffectiveGranularity: Unknown
	Capabilities: [30c v1] Vendor Specific Information: ID=0004 Rev=1 Len=054 <?>
	Kernel driver in use: pcieport

0004:01:00.0 Ethernet controller: Intel Corporation Device 1531 (rev 03)
	Subsystem: Intel Corporation Device 0000
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin A routed to IRQ 0
	Region 0: Memory at 1740000000 (32-bit, non-prefetchable) [disabled] 
	Region 2: I/O ports at 300000 [disabled] 
	Region 3: Memory at 1740800000 (32-bit, non-prefetchable) [disabled] 
	Capabilities: [40] Power Management version 3
		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
		Address: 0000000000000000  Data: 0000
		Masking: 00000000  Pending: 00000000
	Capabilities: [70] MSI-X: Enable- Count=5 Masked-
		Vector table: BAR=3 offset=00000000
		PBA: BAR=3 offset=00002000
	Capabilities: [a0] Express (v2) Endpoint, MSI 00
		DevCap:	MaxPayload 512 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
			ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
			MaxPayload 256 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap:	Port #4, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s unlimited, L1 <16us
			ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk-
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+, OBFF Via WAKE#
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
		LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [100 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
	Capabilities: [140 v1] Device Serial Number 00-a0-c9-ff-ff-00-00-00
	Capabilities: [1a0 v1] Transaction Processing Hints
		Device specific mode supported
		Steering table in TPH capability structure
	Capabilities: [1c0 v1] Latency Tolerance Reporting
		Max snoop latency: 0ns
		Max no snoop latency: 0ns

Hi jiangch0126

Can you describe how you changed the virtual address in the flashrom tool?
Programming the chip on another system is our last resort.
Thank you for your help.

In your case, address 0x1740000000 should be used (Region 0 in your lspci output).
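
As a cross-check, the same system address can be read from the per-device `resource` file that Linux exposes in sysfs, where each line holds the start, end, and flags of one resource in hex. The following is a minimal sketch under that assumption about the file layout; the helper names are hypothetical, and the sample text is illustrative only (its start address is taken from the lspci output above, while the end address and flags are made up).

```python
from pathlib import Path


def parse_resource_table(text: str):
    """Return (start, end, flags) tuples, one per line of a sysfs resource file."""
    regions = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip anything that is not "start end flags"
        start, end, flags = (int(field, 16) for field in fields)
        regions.append((start, end, flags))
    return regions


def bar_system_address(bdf: str, bar: int = 0) -> int:
    """Read the system address of one BAR for a device such as 0004:01:00.0."""
    text = Path(f"/sys/bus/pci/devices/{bdf}/resource").read_text()
    start, end, _flags = parse_resource_table(text)[bar]
    if start == 0 and end == 0:
        raise ValueError(f"BAR{bar} of {bdf} is not assigned")
    return start


# Illustrative sample only: the end address and flags below are invented.
SAMPLE = """\
0x0000001740000000 0x00000017400fffff 0x0000000000040200
0x0000000000000000 0x0000000000000000 0x0000000000000000
"""
print(hex(parse_resource_table(SAMPLE)[0][0]))  # 0x1740000000
```

On the target, `bar_system_address("0004:01:00.0")` would be the call to try. It is left uncalled here because it only works on a system where that device exists.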