Jetson TX1: PCIe-to-SATA card fails to detect HDD when AHCI uses MSI interrupts

[ 10.975596] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 10.984728] mc-err: (0) csw_afiw: EMEM address decode error
[ 10.992960] mc-err: status = 0x20010031; addr = 0x7e5c4000
[ 11.001355] mc-err: secure: no, access-type: write, SMMU fault: none
[ 20.975583] ata1.00: qc timeout (cmd 0xec)
[ 20.982356] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 20.991113] ata1: limiting SATA link speed to 3.0 Gbps
[ 21.545592] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[ 21.554708] mc-err: (0) csw_afiw: EMEM address decode error
[ 21.562969] mc-err: status = 0x20010031; addr = 0x7e5c4000
[ 21.571320] mc-err: secure: no, access-type: write, SMMU fault: none
[ 51.545581] ata1.00: qc timeout (cmd 0xec)
[ 51.552379] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 52.105598] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 320)

Which code base release are you using?
It looks like this is happening because SMMU is not enabled for PCIe.
Can you give me the output of ‘cat /sys/kernel/debug/70019000.iommu/masters/’ ?

Also, what is the make and model of the pcie2sata card? ‘lspci’ output would do.
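
As a rough way to automate this check, the sketch below scans the entry names under the SMMU masters directory for anything that looks like a PCIe client. The helper name `find_pcie_masters` and the matching rules (a PCI address of the form domain:bus:dev.fn, or a name containing "pcie") are assumptions made for illustration; the exact entry names depend on the kernel release.

```python
import os
import re

# Debugfs directory named in this thread; needs root and a mounted debugfs.
MASTERS_DIR = "/sys/kernel/debug/70019000.iommu/masters"

# Assumed pattern for a PCI address such as 0000:00:01.0 (domain:bus:dev.fn).
PCI_ADDR = re.compile(r"^[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]$", re.IGNORECASE)


def find_pcie_masters(names):
    """Return the entries that look like PCIe clients of the SMMU."""
    return [n for n in names if PCI_ADDR.match(n) or "pcie" in n.lower()]


if __name__ == "__main__":
    if os.path.isdir(MASTERS_DIR):
        found = find_pcie_masters(sorted(os.listdir(MASTERS_DIR)))
        print("PCIe masters:", found if found else "none found")
    else:
        print("masters directory not found; is debugfs mounted?")
```

An empty result suggests, but does not prove, that the PCIe controller is not attached to the SMMU.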

root@tegra-ubuntu:~# cd /sys/kernel/debug/70019000.iommu/masters/
root@tegra-ubuntu:/sys/kernel/debug/70019000.iommu/masters# ls
546c0000.i2c nvjpg
70006000.serial sdhci-tegra.0
70006040.serial sdhci-tegra.3
70006200.serial serial8250
70006300.serial sound.27
7000c000.i2c spdif-dit.0
7000c400.i2c spdif-dit.1
7000c500.i2c spdif-dit.2
7000c700.i2c spdif-dit.3
7000d000.i2c spdif-dit.4
7000d100.i2c tegra21-se
7000d400.spi tegra30-hda
7000da00.spi tegra-carveouts.23
702ef000.adsp tegradc.0
adsp_audio.3 tegradc.1
bpmp.24 tegra-fuse
flush_all_threshold_map_pages tegra-otg
flush_all_threshold_unmap_pages tegra-sata.0
gpu.0 tegra-udc.0
host1x tegra-xhci
isp.0 tsec
isp.1 tsecb
mc vi
msenc vic03
nvdec
root@tegra-ubuntu:/sys/kernel/debug/70019000.iommu/masters#
root@tegra-ubuntu:/sys/kernel/debug/70019000.iommu/masters#
root@tegra-ubuntu:/sys/kernel/debug/70019000.iommu/masters# lspci -vvv
00:01.0 PCI bridge: NVIDIA Corporation Device 0fae (rev a1) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
I/O behind bridge: 00001000-00001fff
Memory behind bridge: 13000000-130fffff
Prefetchable memory behind bridge: 0000000020000000-00000000200fffff
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
BridgeCtl: Parity+ SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: [40] Subsystem: NVIDIA Corporation Device 0000
Capabilities: [48] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [50] MSI: Enable- Count=1/2 Maskable- 64bit+
Address: 0000000000000000 Data: 0000
Capabilities: [60] HyperTransport: MSI Mapping Enable- Fixed-
Mapping Address Base: 00000000fee00000
Capabilities: [80] Express (v2) Root Port (Slot+), MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0
ExtTag+ RBE+
DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 5GT/s, Width x4, ASPM L0s L1, Exit Latency L0s <512ns, L1 <4us
ClockPM- Surprise- LLActRep+ BwNot+
LnkCtl: ASPM L0s Enabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 5GT/s, Width x2, TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-
SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
Slot #0, PowerLimit 0.000W; Interlock- NoCompl-
SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
Control: AttnInd Off, PwrInd On, Power- Interlock-
SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
Changed: MRL- PresDet+ LinkState+
RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna+ CRSVisible-
RootCap: CRSVisible-
RootSta: PME ReqID 0000, PMEStatus- PMEPending-
DevCap2: Completion Timeout: Range AB, TimeoutDis+, LTR+, OBFF Not Supported ARIFwd-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled ARIFwd-
LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
Capabilities: [140 v1] L1 PM Substates
L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
PortCommonModeRestoreTime=30us PortTPowerOnTime=70us
Kernel driver in use: pcieport

01:00.0 SATA controller: Marvell Technology Group Ltd. Device 9235 (rev 11) (prog-if 01 [AHCI 1.0])
Subsystem: Marvell Technology Group Ltd. Device 9235
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 549
Region 0: I/O ports at 1020
Region 1: I/O ports at 1030
Region 2: I/O ports at 1028
Region 3: I/O ports at 1034
Region 4: I/O ports at 1000
Region 5: Memory at 13000000 (32-bit, non-prefetchable)
Expansion ROM at 20000000 [disabled]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit-
Address: 7e592000 Data: 0000
Capabilities: [70] Express (v2) Legacy Endpoint, MSI 00
DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <1us, L1 <8us
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 5GT/s, Width x2, ASPM L0s L1, Exit Latency L0s <512ns, L1 <64us
ClockPM- Surprise- LLActRep- BwNot-
LnkCtl: ASPM L0s Enabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 5GT/s, Width x2, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
Capabilities: [e0] SATA HBA v0.0 BAR4 Offset=00000004
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP+ Rollover- Timeout+ NonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
Kernel driver in use: ahci

hi vidyas:

head -n 1 /etc/nv_te

R24 (release), REVISION: 1.0, GCID: 7164062, BOARD: t210ref, EABI: aarch64, DATE: Tue May 17 23:37:30 UTC 2016

Hi vidyas, thank you for your reply. Note that /sys/kernel/debug/70019000.iommu/masters/ is a directory.

It is confirmed that SMMU is not enabled for PCIe.
Can you tell me the exact version within Rel-24 (24.1, 24.2, etc.) so that I can give you a patch to enable SMMU?

Dear vidyas, I am using the R24.1 64-bit kernel. I cannot find R24.2 on the NVIDIA download site. I saw the same problem earlier with the R23 kernel.

Best regards!

You can find R23.2 here:
https://developer.nvidia.com/embedded/linux-tegra-r232

R24.2 is not out yet…this URL is for R24.1, but will probably have R24.2 when it is out:
https://developer.nvidia.com/embedded/linux-tegra

Hi vidyas, I am waiting for the SMMU patch for this issue. Thanks.

Please apply the following patches to enable SMMU for PCIe:

--- a/arch/arm64/boot/dts/tegra210-soc-base.dtsi
+++ b/arch/arm64/boot/dts/tegra210-soc-base.dtsi
@@ -1252,6 +1252,7 @@
                          0x82000000 0 0x13000000 0x0 0x13000000 0 0x0d000000   /* non-prefetchable memory (208 MiB) */
                          0xc2000000 0 0x20000000 0x0 0x20000000 0 0x20000000>; /* prefetchable memory (512 MiB) */

+               iommus = <&smmu TEGRA_SWGROUP_AFI>;
                status = "disabled";

                pci@1,0 {
--
--- a/drivers/iommu/of_tegra-smmu.c
+++ b/drivers/iommu/of_tegra-smmu.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2014-2015, NVIDIA CORPORATION.  All rights reserved.
+ * Copyright (c) 2014-2016, NVIDIA CORPORATION.  All rights reserved.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms and conditions of the GNU General Public License,
@@ -164,14 +164,16 @@ u64 tegra_smmu_of_get_swgids(struct device *dev,
 {
        struct of_phandle_iter iter;
        u64 fixup, swgids = 0;
+       struct device_node *np = dev->of_node;

        if (dev_is_pci(dev)) {
-               return SWGIDS_ERROR_CODE;
-               swgids = TEGRA_SWGROUP_BIT(AFI);
-               goto try_fixup;
+               struct pci_bus *bus = to_pci_dev(dev)->bus;
+               if (!pci_is_root_bus(bus))
+                       dev = bus->bridge;
+               np = of_get_parent(dev->of_node);
        }

-       of_property_for_each_phandle_with_args(iter, dev->of_node, "iommus",
+       of_property_for_each_phandle_with_args(iter, np, "iommus",
                                               "#iommu-cells", 0) {
                struct of_phandle_args *ret = &iter.out_args;

@@ -187,9 +189,11 @@ u64 tegra_smmu_of_get_swgids(struct device *dev,
                swgids |= (1ULL << ret->args[0]);
        }

+       if (dev_is_pci(dev))
+               of_node_put(np);
+
        swgids = swgids ? swgids : SWGIDS_ERROR_CODE;

-try_fixup:
        fixup = tegra_smmu_fixup_swgids(dev, area);

        if (swgids_is_error(fixup))
--

Also, please disable ASPM: only L0s seems to be getting enabled with your endpoint, and that is known to cause issues. You can disable it in any of these ways:
-> Disable it in the kernel config
-> Pass 'pcie_aspm=off' on the kernel command line
-> Execute echo "performance" > /sys/module/pcie_aspm/parameters/policy before loading your module
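
To confirm whether ASPM is actually off after trying one of these options, the LnkCtl line in the `lspci -vvv` output can be inspected. The following is a minimal sketch; the helper names `aspm_states` and `aspm_fully_disabled` are made up for illustration, and the code only assumes the "LnkCtl: ASPM ...;" text format seen in the lspci output in this thread.

```python
import re

# Matches e.g. "LnkCtl: ASPM L0s Enabled; RCB ..." or "LnkCtl:<tab>ASPM Disabled; ..."
# "LnkCtl2:" lines are not matched because the colon must follow "LnkCtl" directly.
LNKCTL = re.compile(r"LnkCtl:\s*ASPM\s+([^;]+);")


def aspm_states(lspci_text):
    """Return the ASPM field of every LnkCtl line, in order of appearance."""
    return [m.group(1).strip() for m in LNKCTL.finditer(lspci_text)]


def aspm_fully_disabled(lspci_text):
    """True only if at least one LnkCtl line exists and all report Disabled."""
    states = aspm_states(lspci_text)
    return bool(states) and all(s == "Disabled" for s in states)
```

The text to pass in would typically be captured from `lspci -vvv` run as root, so that the capability lines are readable.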

Dear vidyas,
I applied the patches you gave me.

root@tegra-ubuntu:~# cat /proc/cmdline
fbcon=map:0 console=tty0 console=ttyS0,115200n8 pcie_aspm=off

PCIE Host Controller Drivers

CONFIG_PCI_TEGRA=y
CONFIG_ARCH_TEGRA_HAS_PCIE=y

CONFIG_PCIEASPM_PERFORMANCE=y
CONFIG_PCIE_PME=y

Then I ran insmod ahci.ko.

This is the log:

root@tegra-ubuntu:~#
root@tegra-ubuntu:~#
root@tegra-ubuntu:~# cat /proc/cmdline
fbcon=map:0 console=tty0 console=ttyS0,115200n8 pcie_aspm=off cma=320M coherent_pool=160M androidboot.modem=none androidboot.serialno=P2180A00P00940c003fd androidboot.security=non-secure tegraid=21.1.2.0.0t
root@tegra-ubuntu:~#
root@tegra-ubuntu:~# cd /sys/kernel/debug/70019000.iommu/masters/
root@tegra-ubuntu:/sys/kernel/debug/70019000.iommu/masters# ls
0000:00:01.0 bpmp.24 spdif-dit.2
1003000.pcie-controller flush_all_threshold_map_pages spdif-dit.3
546c0000.i2c flush_all_threshold_unmap_pages spdif-dit.4
70006000.serial gpu.0 tegra21-se
70006040.serial host1x tegra30-hda
70006200.serial isp.0 tegra-carveouts.23
70006300.serial isp.1 tegradc.0
7000c000.i2c mc tegradc.1
7000c400.i2c msenc tegra-fuse
7000c500.i2c nvdec tegra-otg
7000c700.i2c nvjpg tegra-sata.0
7000d000.i2c sdhci-tegra.0 tegra-udc.0
7000d100.i2c sdhci-tegra.3 tegra-xhci
7000d400.spi serial8250 tsec
7000da00.spi sound.27 tsecb
702ef000.adsp spdif-dit.0 vi
adsp_audio.3 spdif-dit.1 vic03
root@tegra-ubuntu:/sys/kernel/debug/70019000.iommu/masters#
root@tegra-ubuntu:/sys/kernel/debug/70019000.iommu/masters# cd
root@tegra-ubuntu:~#
root@tegra-ubuntu:~#
root@tegra-ubuntu:~# insmod ahci.ko
[ 124.123787] tegra_msi_setup_irq enter.
[ 124.127992] tegra_msi_alloc enter.
[ 124.131699] tegra_msi_alloc out.
[ 124.135976] tegra_msi_map enter.
[ 124.139420] tegra_msi_map out.
[ 124.142647] tegra_msi_setup_irq out.
[ 124.717048] smmu_dump_pagetable(): fault_address=0x000000007e5a5000 pa=0xffffffffffffffff bytes=ffffffffffffffff #pte=0 in L2
[ 124.730264] mc-err: (0) csw_afiw: EMEM decode error on PDE or PTE entry
[ 124.738151] mc-err: status = 0x60010031; addr = 0x7e5a5000
[ 124.743922] mc-err: secure: no, access-type: write, SMMU fault: nr-nw-s
[ 130.267105] smmu_dump_pagetable(): fault_address=0x000000007e5a5000 pa=0xffffffffffffffff bytes=ffffffffffffffff #pte=0 in L2
[ 130.280336] mc-err: (0) csw_afiw: EMEM decode error on PDE or PTE entry
[ 130.288266] mc-err: status = 0x60010031; addr = 0x7e5a5000
[ 130.294031] mc-err: secure: no, access-type: write, SMMU fault: nr-nw-s
[ 140.817154] smmu_dump_pagetable(): fault_address=0x000000007e5a5000 pa=0xffffffffffffffff bytes=ffffffffffffffff #pte=0 in L2
[ 140.830338] mc-err: (0) csw_afiw: EMEM decode error on PDE or PTE entry
[ 140.838183] mc-err: status = 0x60010031; addr = 0x7e5a5000
[ 140.843945] mc-err: secure: no, access-type: write, SMMU fault: nr-nw-s

root@tegra-ubuntu:~#

root@tegra-ubuntu:~# dmesg -c
[ 124.123560] ahci 0000:01:00.0: version 3.0
[ 124.123787] tegra_msi_setup_irq enter.
[ 124.127958] PCIE: tegra_msi_setup_irq(2759)
[ 124.127992] tegra_msi_alloc enter.
[ 124.131676] PCIE: tegra_msi_alloc(2676)
[ 124.131699] tegra_msi_alloc out.
[ 124.135976] tegra_msi_map enter.
[ 124.139399] PCIE: tegra_msi_map(2805)
[ 124.139420] tegra_msi_map out.
[ 124.142647] tegra_msi_setup_irq out.
[ 124.165526] ahci 0000:01:00.0: AHCI 0001.0000 32 slots 4 ports 6 Gbps 0xf impl SATA mode
[ 124.165569] ahci 0000:01:00.0: flags: 64bit ncq sntf led only pmp fbs pio slum part sxs
[ 124.165595] ahci 0000:01:00.0: enabling bus mastering
[ 124.170539] scsi0 : ahci
[ 124.171484] scsi1 : ahci
[ 124.172207] scsi2 : ahci
[ 124.172871] scsi3 : ahci
[ 124.173288] ata1: SATA max UDMA/133 abar m2048@0x13000000 port 0x13000100 irq 549
[ 124.173314] ata2: SATA max UDMA/133 abar m2048@0x13000000 port 0x13000180 irq 549
[ 124.173337] ata3: SATA max UDMA/133 abar m2048@0x13000000 port 0x13000200 irq 549
[ 124.173358] ata4: SATA max UDMA/133 abar m2048@0x13000000 port 0x13000280 irq 549
[ 124.515663] ata2: SATA link down (SStatus 0 SControl 300)
[ 124.526189] ata3: SATA link down (SStatus 0 SControl 300)
[ 124.526472] ata4: SATA link down (SStatus 0 SControl 300)
[ 124.715842] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 124.717048] smmu_dump_pagetable(): fault_address=0x000000007e5a5000 pa=0xffffffffffffffff bytes=ffffffffffffffff #pte=0 in L2
[ 124.730264] mc-err: (0) csw_afiw: EMEM decode error on PDE or PTE entry
[ 124.738151] mc-err: status = 0x60010031; addr = 0x7e5a5000
[ 124.743922] mc-err: secure: no, access-type: write, SMMU fault: nr-nw-s
[ 129.716091] ata1.00: qc timeout (cmd 0xec)
[ 129.716207] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 130.265883] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 130.267105] smmu_dump_pagetable(): fault_address=0x000000007e5a5000 pa=0xffffffffffffffff bytes=ffffffffffffffff #pte=0 in L2
[ 130.280336] mc-err: (0) csw_afiw: EMEM decode error on PDE or PTE entry
[ 130.288266] mc-err: status = 0x60010031; addr = 0x7e5a5000
[ 130.294031] mc-err: secure: no, access-type: write, SMMU fault: nr-nw-s
[ 133.392581] init: alsa-restore main process (1486) terminated with status 19
[ 133.448012] init: plymouth-stop pre-start process (1555) terminated with status 1
[ 140.266080] ata1.00: qc timeout (cmd 0xec)
[ 140.266196] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 140.266276] ata1: limiting SATA link speed to 3.0 Gbps
[ 140.815952] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[ 140.817154] smmu_dump_pagetable(): fault_address=0x000000007e5a5000 pa=0xffffffffffffffff bytes=ffffffffffffffff #pte=0 in L2
[ 140.830338] mc-err: (0) csw_afiw: EMEM decode error on PDE or PTE entry
[ 140.838183] mc-err: status = 0x60010031; addr = 0x7e5a5000
[ 140.843945] mc-err: secure: no, access-type: write, SMMU fault: nr-nw-s
[ 170.816562] ata1.00: qc timeout (cmd 0xec)
[ 170.816680] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 171.366405] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 320)

root@tegra-ubuntu:/sys/module/pcie_aspm/parameters# cat policy
[default] performance powersave

root@tegra-ubuntu:/sys/module/pcie_aspm/parameters# echo "performance" > policy
-bash: echo: write error: Operation not permitted
root@tegra-ubuntu:/sys/module/pcie_aspm/parameters#
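
The policy file marks the active policy with square brackets, as in the `cat policy` output above. A small sketch for reading it programmatically follows; the function names are hypothetical. The "Operation not permitted" error is likely a consequence of booting with pcie_aspm=off, since the kernel appears to refuse policy changes once ASPM has been disabled that way, but treat that as an assumption to verify against your kernel source.

```python
import re

POLICY_FILE = "/sys/module/pcie_aspm/parameters/policy"


def active_aspm_policy(text):
    """Return the bracketed (active) policy name, or None if none is marked."""
    m = re.search(r"\[(\w+)\]", text)
    return m.group(1) if m else None


def read_active_policy(path=POLICY_FILE):
    """Read the sysfs policy file and return the active policy name."""
    with open(path) as f:
        return active_aspm_policy(f.read())
```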

Are you using the upstream driver for your device? If yes, can you please point me to the source code path?

Hi vidyas,
I compiled ahci.ko from drivers/ata/ahci.c and loaded it with insmod ahci.ko. I want the card to use PCI MSI as its interrupt.

Thanks.

Hi,
I found a similar Marvell-controller-based PCIe2SATA add-on card and was able to reproduce the error.
The issue seems to occur only when MSI interrupts are enabled; there is no issue with legacy interrupts.
I also tested two different SiiG (Silicon Image SATA controller) based PCIe2SATA cards with MSI interrupts enabled, and they worked fine.
So only the Marvell controller cards seem to have issues with MSI interrupts enabled.
For the time being, to unblock yourself, you can apply the following patch, which disables MSI interrupts (so legacy interrupts will be used by default):

diff --git a/drivers/pci/host/pci-tegra.c b/drivers/pci/host/pci-tegra.c
index 094cfd6e..17c8c44 100644
--- a/drivers/pci/host/pci-tegra.c
+++ b/drivers/pci/host/pci-tegra.c
@@ -2587,7 +2587,7 @@ static int tegra_pcie_disable_msi(struct tegra_pcie *pcie);
 static int tegra_pcie_init(struct tegra_pcie *pcie)
 {
    int err = 0;
-   struct platform_device *pdev = to_platform_device(pcie->dev);
+// struct platform_device *pdev = to_platform_device(pcie->dev);

    pcibios_min_io = 0x1000ul;

@@ -2614,7 +2614,7 @@ static int tegra_pcie_init(struct tegra_pcie *pcie)
    }
    /* setup the AFI address translations */
    tegra_pcie_setup_translations(pcie);
-
+#if 0
    if (IS_ENABLED(CONFIG_PCI_MSI)) {
        err = tegra_pcie_enable_msi(pcie, false);
        if (err < 0) {
@@ -2624,7 +2624,7 @@ static int tegra_pcie_init(struct tegra_pcie *pcie)
            goto fail_release_resource;
        }
    }
-
+#endif
    tegra_periph_reset_deassert(pcie->pcie_pcie);

    tegra_pcie_check_ports(pcie);

Meanwhile, we’ll debug the issue and come up with a proper solution.
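
To check which interrupt mode a device ended up in after a change like this, the MSI capability line from `lspci -vvv` can be parsed. This is a minimal sketch that assumes the "MSI: Enable+ Count=1/1 Maskable- 64bit-" format shown earlier in this thread; `parse_msi_cap` is a hypothetical helper name.

```python
import re

# Fields as printed by lspci: Enable flag, enabled/supported vector counts,
# per-vector masking support, and 64-bit address support.
MSI_CAP = re.compile(
    r"MSI: Enable([+-]) Count=(\d+)/(\d+) Maskable([+-]) 64bit([+-])"
)


def parse_msi_cap(lspci_text):
    """Return a dict describing the first MSI capability found, or None."""
    m = MSI_CAP.search(lspci_text)
    if m is None:
        return None
    return {
        "enabled": m.group(1) == "+",
        "vectors_enabled": int(m.group(2)),
        "vectors_supported": int(m.group(3)),
        "maskable": m.group(4) == "+",
        "addr64": m.group(5) == "+",
    }
```

With MSI disabled in the host driver, the endpoint would be expected to report "Enable-" and fall back to its legacy interrupt pin.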

I’ve root-caused the issue, and the following change fixes it. You can revert the previous change that disables MSI and use the change below instead.

diff --git a/drivers/pci/host/pci-tegra.c b/drivers/pci/host/pci-tegra.c
index 6c070a9..6c06a76 100644
--- a/drivers/pci/host/pci-tegra.c
+++ b/drivers/pci/host/pci-tegra.c
@@ -2791,7 +2791,7 @@ static int tegra_pcie_enable_msi(struct tegra_pcie *pcie, bool no_init)
        }

        /* setup AFI/FPCI range */
-       msi->pages = __get_free_pages(GFP_KERNEL, 0);
+       msi->pages = __get_free_pages(GFP_DMA32, 0);
    }
    base = virt_to_phys((void *)msi->pages);
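
One plausible reading of why GFP_DMA32 helps: the Marvell endpoint reports "64bit-" in its MSI capability, so it can only be programmed with a 32-bit MSI target address. If the MSI page is allocated above 4 GiB, the upper address bits are lost and the device writes somewhere else. The sketch below only illustrates that arithmetic. The example physical addresses and the DRAM base of 0x80000000 are assumptions chosen for illustration, not values confirmed in this thread.

```python
MASK_32 = 0xFFFF_FFFF
ASSUMED_DRAM_BASE = 0x8000_0000  # assumption: start of DRAM in the TX1 memory map


def msi_addr_seen_by_device(phys_addr, addr64_capable):
    """Address the endpoint will actually write to when raising an MSI."""
    return phys_addr if addr64_capable else phys_addr & MASK_32


def truncation_breaks_msi(phys_addr, addr64_capable):
    """True if the device would target a different address than intended."""
    return msi_addr_seen_by_device(phys_addr, addr64_capable) != phys_addr


# Hypothetical page above 4 GiB, which a GFP_KERNEL allocation may return.
high_page = 0x1_7E59_2000
# Hypothetical page below 4 GiB, which GFP_DMA32 is meant to guarantee.
low_page = 0x8340_0000

truncated = msi_addr_seen_by_device(high_page, addr64_capable=False)
print(hex(truncated))                  # truncated target address
print(truncated >= ASSUMED_DRAM_BASE)  # does it still point into DRAM?
print(truncation_breaks_msi(low_page, addr64_capable=False))
```

Under these assumptions the truncated address falls below the DRAM base, which would be consistent with the memory controller reporting an address decode error on the write.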

I am curious about MSI. When it produces an interrupt, is the TX1’s architecture such that the interrupt can be serviced on any CPU, or does this require CPU0? If this could be distributed to multiple CPU cores, I could see how it could be used for faster hardware IRQ handling (at least under PCIe) without IRQ starvation.

If the irqbalance daemon is running in user space, interrupts can be handled on other CPUs as well. Please note that at the Tegra hardware level there is only one interrupt for all MSIs (an MSI aggregator ultimately raises a single interrupt to the CPU).
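
To see how interrupts for a given IRQ are actually spread across cores, the per-CPU counters in /proc/interrupts can be read. A minimal sketch follows; `irq_counts` and `read_irq_counts` are hypothetical helper names, and the parsing assumes the usual layout of a CPU header row followed by one row per IRQ.

```python
def irq_counts(proc_interrupts_text, irq):
    """Return {cpu_name: count} for one IRQ row of /proc/interrupts text."""
    lines = proc_interrupts_text.splitlines()
    if not lines:
        return {}
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    prefix = f"{irq}:"
    for line in lines[1:]:
        fields = line.split()
        if fields and fields[0] == prefix:
            counts = fields[1:1 + len(cpus)]
            return {cpu: int(n) for cpu, n in zip(cpus, counts)}
    return {}


def read_irq_counts(irq, path="/proc/interrupts"):
    """Read the live counters for one IRQ from the running system."""
    with open(path) as f:
        return irq_counts(f.read(), irq)
```

Calling `read_irq_counts(549)` twice, a few seconds apart under disk load, would show which cores are servicing the AHCI interrupt (549 is the IRQ number from the logs above; it may differ on other boots).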

Hey, I tried these suggestions but I haven’t had any luck getting my Marvell based HBA to work.
TX1 running 24.2 with the patched kernel. (Where appropriate, some of the suggested patches already existed)
lspci output:

06:00.0 SATA controller: Marvell Technology Group Ltd. Device 9215 (rev 11) (prog-if 01 [AHCI 1.0])
	Subsystem: Marvell Technology Group Ltd. Device 9215
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR+ <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 554
	Region 0: I/O ports at 4020 
	Region 1: I/O ports at 4030 
	Region 2: I/O ports at 4028 
	Region 3: I/O ports at 4034 
	Region 4: I/O ports at 4000 
	Region 5: Memory at 13600000 (32-bit, non-prefetchable) 
	Expansion ROM at 20600000 [disabled] 
	Capabilities: [40] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold-)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit-
		Address: 83400000  Data: 0007
	Capabilities: [70] Express (v2) Legacy Endpoint, MSI 00
		DevCap:	MaxPayload 512 bytes, PhantFunc 0, Latency L0s <1us, L1 <8us
			ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr- UncorrErr+ FatalErr- UnsuppReq- AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <512ns, L1 <64us
			ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
		LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [e0] SATA HBA v0.0 BAR4 Offset=00000004
	Capabilities: [100 v1] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO+ CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap:	First Error Pointer: 0e, GenCap- CGenEn- ChkCap- ChkEn-
	Kernel driver in use: ahci

dmesg | grep -i ata:

[   14.594760] ata14: softreset failed (1st FIS failed)
[   24.644754] ata14: softreset failed (1st FIS failed)
[   59.694762] ata14: softreset failed (1st FIS failed)
[   59.743260] ata14: limiting SATA link speed to 1.5 Gbps
[   64.954818] ata14: softreset failed (device not ready)
[   64.962780] ata14: reset failed, giving up

Any ideas what might fix this? I just want to be able to talk to a hard drive that’s connected to the HBA via a port multiplier.