Hi guys, I ran into a problem while running a reboot test. The test checks whether the AQC113 connected to the 14160000.pcie
controller is enumerated successfully on every boot. My reboot script is as follows:
# ~/.profile
bash ./reboot_test.sh
# ~/reboot_test.sh
if lspci | grep Aquantia; then
    echo rebooting
    sleep 10
    echo password | sudo -S reboot
fi
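Since the loop is meant to stop only when the card is missing, it may also help to record how many boots passed before it stops. Below is a rough sketch of how that could look. COUNT_FILE, nic_enumerated and bump_count are names made up for illustration; they are not part of my actual script.

```shell
#!/usr/bin/env bash
# Sketch only: COUNT_FILE, nic_enumerated and bump_count are hypothetical names.
COUNT_FILE="${COUNT_FILE:-$HOME/reboot_count}"

# Reads lspci-style text on stdin; succeeds if an Aquantia device is listed.
nic_enumerated() {
    grep -q 'Aquantia'
}

# Adds one to the number stored in COUNT_FILE (0 if missing or empty) and prints it.
bump_count() {
    local n=0
    if [ -s "$COUNT_FILE" ]; then
        n=$(cat "$COUNT_FILE")
    fi
    n=$((n + 1))
    echo "$n" > "$COUNT_FILE"
    echo "$n"
}

# Intended use in reboot_test.sh (commented out so the sketch has no side effects):
# if lspci | nic_enumerated; then
#     echo "pass $(bump_count), rebooting"
#     sleep 10
#     echo password | sudo -S reboot
# fi
```

With this, the counter file shows how many successful cycles ran before the loop stopped or the boot got stuck.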
I also modified the kernel code to print some debug info. Here is the git diff:
diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index ae511b84b3d8..66ff99ba3ea6 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -417,6 +417,7 @@ int dw_pcie_host_init(struct pcie_port *pp)
goto err_free_msi;
}
+ printk("ben %s: %d %s, %s\n", __FILE__, __LINE__, __func__, "");
/* Ignore errors, the link may come up later */
dw_pcie_wait_for_link(pci);
diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
index 5563979310ba..6de020064313 100644
--- a/drivers/pci/controller/dwc/pcie-designware.c
+++ b/drivers/pci/controller/dwc/pcie-designware.c
@@ -550,6 +550,8 @@ int dw_pcie_wait_for_link(struct dw_pcie *pci)
{
int retries;
+ dev_info(pci->dev, "ben ");
+ dump_stack();
/* Check if the link is up or not */
for (retries = 0; retries < LINK_WAIT_MAX_RETRIES; retries++) {
if (dw_pcie_link_up(pci)) {
@@ -557,6 +559,7 @@ int dw_pcie_wait_for_link(struct dw_pcie *pci)
return 0;
}
usleep_range(LINK_WAIT_USLEEP_MIN, LINK_WAIT_USLEEP_MAX);
+ dev_info(pci->dev, "ben: wait loop (%d/%d)\n", retries+1, LINK_WAIT_MAX_RETRIES);
}
dev_info(pci->dev, "Phy link never came up\n");
@@ -569,10 +572,12 @@ int dw_pcie_link_up(struct dw_pcie *pci)
{
u32 val;
+ //dev_info(pci->dev, "ben %s: %d %s, before get link up\n", __FILE__, __LINE__, __func__);
if (pci->ops && pci->ops->link_up)
return pci->ops->link_up(pci);
val = readl(pci->dbi_base + PCIE_PORT_DEBUG1);
+ dev_info(pci->dev, "ben %s: %d %s, PCIE_PORT_DEBUG1=0x%08X\n", __FILE__, __LINE__, __func__, val);
return ((val & PCIE_PORT_DEBUG1_LINK_UP) &&
(!(val & PCIE_PORT_DEBUG1_LINK_IN_TRAINING)));
}
diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
index 76c57d4fa714..fd4c78dab204 100644
--- a/drivers/pci/controller/dwc/pcie-designware.h
+++ b/drivers/pci/controller/dwc/pcie-designware.h
@@ -25,7 +25,7 @@
#define DW_PCIE_VER_562A 0x3536322A
/* Parameters for the waiting for link up routine */
-#define LINK_WAIT_MAX_RETRIES 10
+#define LINK_WAIT_MAX_RETRIES 100
#define LINK_WAIT_USLEEP_MIN 90000
#define LINK_WAIT_USLEEP_MAX 100000
diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c b/drivers/pci/controller/dwc/pcie-tegra194.c
index a44b477cee27..77fd5cdaaa4b 100644
--- a/drivers/pci/controller/dwc/pcie-tegra194.c
+++ b/drivers/pci/controller/dwc/pcie-tegra194.c
@@ -1548,6 +1548,7 @@ static int tegra_pcie_dw_link_up(struct dw_pcie *pci)
{
struct tegra_pcie_dw *pcie = to_tegra_pcie(pci);
u32 val = dw_pcie_readw_dbi(pci, pcie->pcie_cap_base + PCI_EXP_LNKSTA);
+ dev_info(pci->dev, "%s: val=%u", __func__, val);
return !!(val & PCI_EXP_LNKSTA_DLLLA);
}
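The dev_info() added to tegra_pcie_dw_link_up() prints the Link Status register as a plain decimal number, which is hard to read in the log. A small helper like the one below could decode it. This is only a sketch: decode_lnksta is a made-up name, the bit masks are the PCI_EXP_LNKSTA_* values from include/uapi/linux/pci_regs.h, and the sample input is illustrative, not taken from my log.

```shell
#!/usr/bin/env bash
# Sketch only: decode_lnksta is a hypothetical helper name.
# Bit masks follow PCI_EXP_LNKSTA_* in include/uapi/linux/pci_regs.h.
decode_lnksta() {
    local val=$(( $1 ))                      # decimal (as printed by %u) or 0x-prefixed hex
    local speed=$(( val & 0x000f ))          # PCI_EXP_LNKSTA_CLS: current link speed code
    local width=$(( (val & 0x03f0) >> 4 ))   # PCI_EXP_LNKSTA_NLW: negotiated lane count
    local training=$(( (val >> 11) & 1 ))    # PCI_EXP_LNKSTA_LT: link training in progress
    local dllla=$(( (val >> 13) & 1 ))       # PCI_EXP_LNKSTA_DLLLA: data link layer active
    echo "speed=$speed width=x$width training=$training dllla=$dllla"
}

# Illustrative input only (0x3043), not a value taken from the log.
decode_lnksta 12355
# prints: speed=3 width=x4 training=0 dllla=1
```

The link_up hook in the diff returns true only when the dllla field is 1, so that is the field to watch in the wait-loop prints.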
The problem is as follows. I expected to catch the case where the kernel does not enumerate the AQC113 10G network card; the script would then stop rebooting and leave me in an interactive shell on the debug serial console. I never hit that case. Instead, the boot gets stuck, and the last messages printed by the kernel are:
[ 29.207841] pcieport 0008:00:00.0: Adding to iommu group 8
[ 29.207922] pcieport 0008:00:00.0: PME: Signaling with IRQ 186
[ 29.208222] pcieport 0008:00:00.0: AER: enabled with IRQ 186
In a normal boot, the messages that should follow are:
[ 7.922666] systemd[1]: systemd 249.11-0ubuntu3.11 running in system mode (+PAM +AUDIT +SELINUX +APPARMOR +IMA +SMAC)
[ 7.922979] systemd[1]: Detected architecture arm64.
[ 8.029953] systemd[1]: Hostname set to <V701>.
[ 8.049820] systemd[1]: Using hardware watchdog 'NVIDIA Tegra186 WDT', version 0, device /dev/watchdog
[ 8.049837] systemd[1]: Set hardware watchdog to 2min.
...
Here is the full log from the stuck reboot:
boot_stuck.log (96.6 KB)