PCIe bus prints errors; can’t insmod PCIe device driver

Background:
1. Board: TX2, JetPack 3.3, Xilinx V6
Requirement: We use Xilinx’s DMA Subsystem for PCI Express to communicate with the TX2.
Questions:
1. The PCIe bus reports errors, and then I can’t load my PCIe device driver.
The error log image is attached.
2. Data cannot be reliably written to memory.
Only part of the data is written to memory. All the data with a value of 0 below failed to be written:
00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-
00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-
1effa008-1effa008-1effa008-1effa008-1effa009-1effa009-1effa009-1effa009-1effa00a-1effa00a-1effa00a-1effa00a-1effa00b-1effa00b-1effa00b-1effa00b-
00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-
00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-
1effa014-1effa014-1effa014-1effa014-1effa015-1effa015-1effa015-1effa015-1effa016-1effa016-1effa016-1effa016-1effa017-1effa017-1effa017-1effa017-
00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-
00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-
00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-
1effa024-1effa024-1effa024-1effa024-1effa025-1effa025-1effa025-1effa025-1effa026-1effa026-1effa026-1effa026-1effa027-1effa027-1effa027-1effa027-
1effa028-1effa028-1effa028-1effa028-1effa029-1effa029-1effa029-1effa029-1effa02a-1effa02a-1effa02a-1effa02a-1effa02b-1effa02b-1effa02b-1effa02b-
1effa02c-1effa02c-1effa02c-1effa02c-1effa02d-1effa02d-1effa02d-1effa02d-1effa02e-1effa02e-1effa02e-1effa02e-1effa02f-1effa02f-1effa02f-1effa02f-
00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-
1effa034-1effa034-1effa034-1effa034-1effa035-1effa035-1effa035-1effa035-1effa036-1effa036-1effa036-1effa036-1effa037-1effa037-1effa037-1effa037-
1effa038-1effa038-1effa038-1effa038-1effa039-1effa039-1effa039-1effa039-1effa03a-1effa03a-1effa03a-1effa03a-1effa03b-1effa03b-1effa03b-1effa03b-

My driver is attached.

Please help me. Thank you very much.

Xilinx_Answer_65444_Linux_Files_rel20180420-source.rar (191 KB)



Hi,
I see a lot of physical layer errors. Can you please check your setup once and see if there are any loose cables?
Also, can you please share the output of ‘sudo lspci -vvvv’?

1. I first power on the Xilinx board, wait a few seconds, and then power on the TX2, giving the TX2 enough boot time. With this sequence the physical layer errors disappeared. Do I need to modify the boot process, for example by adding a delay?

nvidia@tegra-ubuntu:~$ dmesg | grep pcie
[ 0.140533] node /plugin-manager/fragment-500-e3325-pcie match with board >=3489-0000-200
[ 0.262061] iommu: Adding device 10003000.pcie-controller to group 50
[ 7.605773] tegra-pcie 10003000.pcie-controller: 4x1, 1x1 configuration
[ 7.627634] tegra-pcie 10003000.pcie-controller: PCIE: Enable power rails
[ 7.628054] tegra-pcie 10003000.pcie-controller: probing port 0, using 4 lanes
[ 7.630283] tegra-pcie 10003000.pcie-controller: probing port 2, using 1 lanes
[ 8.046586] tegra-pcie 10003000.pcie-controller: link 2 down, retrying
[ 8.467866] tegra-pcie 10003000.pcie-controller: link 2 down, retrying
[ 8.876544] tegra-pcie 10003000.pcie-controller: link 2 down, retrying
[ 8.887380] tegra-pcie 10003000.pcie-controller: link 2 down, ignoring
[ 8.911416] tegra-pcie 10003000.pcie-controller: PCI host bridge to bus 0000:00
[ 8.912419] pcieport 0000:00:01.0: enabling device (0000 -> 0002)
[ 8.912592] pcieport 0000:00:01.0: Signaling PME through PCIe PME interrupt
[ 8.912599] pcie_pme 0000:00:01.0:pcie01: service driver pcie_pme loaded
[ 8.912681] aer 0000:00:01.0:pcie02: service driver aer loaded
[ 8.912865] tegra-pcie 10003000.pcie-controller: speed change : Gen-1 -> Gen-2

nvidia@tegra-ubuntu:~$ sudo lspci -vvvv
[sudo] password for nvidia:
00:01.0 PCI bridge: NVIDIA Corporation Device 10e5 (rev a1) (prog-if 00 [Normal decode])
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 388
Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
I/O behind bridge: 0000f000-00000fff
Memory behind bridge: 50100000-502fffff
Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: [40] Subsystem: NVIDIA Corporation Device 0000
Capabilities: [48] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [50] MSI: Enable- Count=1/2 Maskable- 64bit+
Address: 0000000000000000 Data: 0000
Capabilities: [60] HyperTransport: MSI Mapping Enable- Fixed-
Mapping Address Base: 00000000fee00000
Capabilities: [80] Express (v2) Root Port (Slot+), MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0
ExtTag+ RBE+
DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 5GT/s, Width x4, ASPM L0s L1, Exit Latency L0s <512ns, L1 <4us
ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
Slot #0, PowerLimit 0.000W; Interlock- NoCompl-
SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
Control: AttnInd Off, PwrInd On, Power- Interlock-
SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
Changed: MRL- PresDet+ LinkState+
RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna+ CRSVisible-
RootCap: CRSVisible-
RootSta: PME ReqID 0000, PMEStatus- PMEPending-
DevCap2: Completion Timeout: Range AB, TimeoutDis+, LTR+, OBFF Not Supported ARIFwd-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled ARIFwd-
LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
Capabilities: [140 v1] L1 PM Substates
L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
PortCommonModeRestoreTime=30us PortTPowerOnTime=70us
Kernel driver in use: pcieport

01:00.0 Serial controller: Xilinx Corporation Device 7024 (prog-if 01 [16450])
Subsystem: Xilinx Corporation Device 0007
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 388
Region 0: Memory at 50100000 (32-bit, non-prefetchable)
Region 1: Memory at 50200000 (32-bit, non-prefetchable)
Capabilities: [80] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [b0] MSI-X: Enable+ Count=32 Masked-
Vector table: BAR=1 offset=00008000
PBA: BAR=1 offset=00008fe0
Capabilities: [c0] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 5GT/s, Width x4, ASPM not supported, Exit Latency L0s unlimited, L1 unlimited
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range B, TimeoutDis+, LTR-, OBFF Not Supported
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
Capabilities: [100 v2] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
Kernel driver in use: xdma
Kernel modules: xdma
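
When reading a long `lspci -vvvv` dump like the one above, it can help to pull out just the link capability (LnkCap) and the negotiated link status (LnkSta) for each device and compare them. The following is a minimal sketch, not part of the original thread: the function names `link_summary` and `degraded` are hypothetical, the sample text is copied from the output above, and the parser assumes device headers of the form `bb:dd.f` without a PCI domain prefix.

```python
import re

# Sample lines copied from the lspci output in this thread.
SAMPLE = """\
00:01.0 PCI bridge: NVIDIA Corporation Device 10e5 (rev a1) (prog-if 00 [Normal decode])
LnkCap: Port #0, Speed 5GT/s, Width x4, ASPM L0s L1, Exit Latency L0s <512ns, L1 <4us
LnkSta: Speed 5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
01:00.0 Serial controller: Xilinx Corporation Device 7024 (prog-if 01 [16450])
LnkCap: Port #0, Speed 5GT/s, Width x4, ASPM not supported, Exit Latency L0s unlimited, L1 unlimited
LnkSta: Speed 5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
"""

HEADER = re.compile(r"^([0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s")
LINK = re.compile(r"(LnkCap|LnkSta):.*?Speed ([\d.]+GT/s), Width (x\d+)")


def link_summary(lspci_text):
    """Map each device to its advertised ("cap") and negotiated ("sta") link."""
    devices = {}
    current = None
    for line in lspci_text.splitlines():
        header = HEADER.match(line)
        if header:
            # New device section, e.g. "01:00.0 Serial controller: ..."
            current = header.group(1)
            devices[current] = {}
            continue
        link = LINK.search(line)
        if link and current is not None:
            key = "cap" if link.group(1) == "LnkCap" else "sta"
            devices[current][key] = (link.group(2), link.group(3))
    return devices


def degraded(devices):
    """List devices whose negotiated speed/width differs from the capability."""
    return [dev for dev, link in devices.items()
            if link.get("cap") != link.get("sta")]


if __name__ == "__main__":
    summary = link_summary(SAMPLE)
    for dev, link in summary.items():
        print(dev, "cap", link.get("cap"), "sta", link.get("sta"))
    print("degraded links:", degraded(summary))
```

For the output posted here, both the root port and the Xilinx endpoint report 5GT/s x4 in LnkCap and LnkSta, so the link itself trained at the full advertised speed and width.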

2. I’m very sorry, I used the wrong title. PCIe data cannot be reliably written to memory; that is the most important question. All the data with a value of 0 below failed to be written to memory. Can you give me some advice? Thank you very much.

00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-
00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-
1effa008-1effa008-1effa008-1effa008-1effa009-1effa009-1effa009-1effa009-1effa00a-1effa00a-1effa00a-1effa00a-1effa00b-1effa00b-1effa00b-1effa00b-
00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-
00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-
1effa014-1effa014-1effa014-1effa014-1effa015-1effa015-1effa015-1effa015-1effa016-1effa016-1effa016-1effa016-1effa017-1effa017-1effa017-1effa017-
00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-
00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-
00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-
1effa024-1effa024-1effa024-1effa024-1effa025-1effa025-1effa025-1effa025-1effa026-1effa026-1effa026-1effa026-1effa027-1effa027-1effa027-1effa027-
1effa028-1effa028-1effa028-1effa028-1effa029-1effa029-1effa029-1effa029-1effa02a-1effa02a-1effa02a-1effa02a-1effa02b-1effa02b-1effa02b-1effa02b-
1effa02c-1effa02c-1effa02c-1effa02c-1effa02d-1effa02d-1effa02d-1effa02d-1effa02e-1effa02e-1effa02e-1effa02e-1effa02f-1effa02f-1effa02f-1effa02f-
00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-00000000-
1effa034-1effa034-1effa034-1effa034-1effa035-1effa035-1effa035-1effa035-1effa036-1effa036-1effa036-1effa036-1effa037-1effa037-1effa037-1effa037-
1effa038-1effa038-1effa038-1effa038-1effa039-1effa039-1effa039-1effa039-1effa03a-1effa03a-1effa03a-1effa03a-1effa03b-1effa03b-1effa03b-1effa03b-
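
One thing the dump above shows is that each row holds 16 words, which is 64 bytes if the words are 32-bit, and that the rows are either entirely zero or entirely filled. The rows that did arrive also carry the values expected for their position (row 2 starts at 1effa008, row 5 at 1effa014), so the holes appear to be whole missing blocks rather than shifted data. The following is a minimal sketch for locating such holes in a dump of this format. It is not from the original thread: `zero_rows` and `data_row` are hypothetical helper names, and the byte offsets assume 32-bit words and a dump that starts at offset 0 of the buffer.

```python
def zero_rows(dump_text):
    """Return (row_index, byte_offset) for every row whose words are all zero.

    Assumes each row is a '-'-separated list of 32-bit hex words and that
    all rows have the same number of words.
    """
    missing = []
    for index, line in enumerate(dump_text.strip().splitlines()):
        words = [w for w in line.strip().split("-") if w]
        if words and all(int(w, 16) == 0 for w in words):
            missing.append((index, index * len(words) * 4))
    return missing


def data_row(first):
    """Build a row like the ones in the thread: 4 values, each repeated 4 times."""
    return "-".join(f"{first + i // 4:08x}" for i in range(16)) + "-"


ZERO_ROW = "-".join(["00000000"] * 16) + "-"

# Mirrors the first six rows of the dump posted above.
SAMPLE = "\n".join([
    ZERO_ROW,
    ZERO_ROW,
    data_row(0x1EFFA008),
    ZERO_ROW,
    ZERO_ROW,
    data_row(0x1EFFA014),
])

if __name__ == "__main__":
    for row, offset in zero_rows(SAMPLE):
        print(f"row {row} (byte offset {offset}) is all zero")
```

Running the same function over the full dump would give the list of 64-byte blocks that never reached memory, which can then be compared against the descriptors or transfer sizes the driver programmed.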

Since I see “Mem+ BusMaster+”, which means the endpoint has its BAR windows (memory decoding) enabled and Bus Mastering enabled, I don’t see any reason why writes are not working. This should be looked into from the FPGA point of view. If possible, please connect the card to an x86 system and check whether the issue shows up there (if it does not, we’ll debug further from the Tegra side). At this point, it looks like an FPGA configuration issue.
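
For reference, the “Mem+ BusMaster+” flags that lspci prints come from the 16-bit PCI Command register at configuration offset 0x04: bit 1 enables memory space decoding and bit 2 enables bus mastering. The following is a minimal sketch of that decoding. The function name `decode_command` is hypothetical, and the labels are chosen to mirror the lspci “Control:” line shown earlier in the thread.

```python
# Bit positions of the PCI Command register, labelled the way lspci prints them.
COMMAND_BITS = {
    0: "I/O",        # I/O space decoding
    1: "Mem",        # memory space decoding (BAR windows)
    2: "BusMaster",  # device may initiate DMA
    3: "SpecCycle",
    4: "MemWINV",
    5: "VGASnoop",
    6: "ParErr",
    7: "Stepping",
    8: "SERR",
    9: "FastB2B",
    10: "DisINTx",   # legacy INTx interrupts disabled
}


def decode_command(value):
    """Render a Command register value in lspci's '+'/'-' flag notation."""
    return " ".join(
        f"{name}{'+' if (value >> bit) & 1 else '-'}"
        for bit, name in COMMAND_BITS.items()
    )


if __name__ == "__main__":
    # 0x0006 has only bits 1 and 2 set, matching the endpoint in this thread.
    print(decode_command(0x0006))
    # prints: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
```

A value of 0x0006 reproduces the “Control:” line reported for device 01:00.0, which is why the endpoint should be able to both decode its BARs and issue DMA writes to host memory.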

It was the driver’s fault. Thank you very much.