PCIe USB 3.0 extension on Jetson Nano

Hi all,

I am new to Jetson and like the system a lot better than other embedded systems. However, I am stuck with my project.
I have a question about the PCIe x1 (via M.2) capabilities of the Jetson Nano 4GB on the standard NVIDIA dev board. I am using JetPack 4.5.1.

I connected a standard PCIe USB 3.0 card (7 Pt PCI Express USB 3.0 Card - Std & LP - USB 3.0 Cards | Germany) to the Jetson Nano's M.2 Key A/E connector via two adapters: one from M.2 to Mini PCIe (https://www.amazon.de/dp/B07JCPMKQ8/ref=cm_sw_r_cp_apa_i_AC6P1AG2HDCD95YFR26N), and one from Mini PCIe to PCIe (https://www.amazon.de/dp/B07ST256GJ/ref=cm_sw_r_cp_apa_i_8J1YDVVH5VJ32WYYJ4MK).
My aim is to use the card to connect 6 USB3 cameras (Basler acA5472-17um) at once to the Jetson Nano.

At first, connecting the USB card stopped the system from booting, showing only something like
Pcibus receiver error: corrected…
I couldn’t copy or photograph the message, because it was spammed across the screen very fast.

After some googling and adding pci=noaer to the boot options, the system would boot and show the card as well as the 6 cameras. I can connect to the cameras and set acquisition parameters, but as soon as I start recording, the whole OS freezes after 0-5 frames and never recovers. Again, I don’t have an error message. Even running watch with dmesg didn’t help, as the system seems to freeze before reporting the error.

When I use an external USB3 hub connected to the onboard USB3 port of the dev board, or the USB3 ports of the dev board directly, the cameras work flawlessly.

The usb card and pcie adapter are both powered externally, so I don’t think it’s a power consumption issue.

Actually I don’t need to run the cams in parallel. It would be sufficient to connect to a cam, record an image, and disconnect one after another.

Is there something I misunderstood about the standards? Is my hardware not capable of super speed USB traffic? Is it just one more boot parameter I need to set for the system to work?

Any idea would be highly appreciated. I hope I included the relevant info; otherwise, please just ask.

Thanks,
Alex

Adding ‘pci=noaer’ only suppresses the error reports. So it is better to understand why those errors are occurring. Based on the description, I assume that all of them are ‘Corrected’ errors. These errors can appear if the electrical quality of the link is poor (probably because of the multiple adapters you are using?).
In any case, could you please give me the output of ‘sudo lspci -vvvv’? (Of course, you can capture this after adding ‘pci=noaer’.)
I’m wondering if ASPM is playing a role here.
Along with ‘pci=noaer’, could you please also add ‘pcie_aspm=off’ and update your observations?
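
To confirm that both options actually reached the running kernel, the command line can be read back from /proc/cmdline after a reboot. Below is a minimal Python sketch of such a check. The function names are made up for illustration, and it only looks for the two options mentioned here. Note that the AER option is spelled ‘pci=noaer’ in the kernel's parameter list, so a token like ‘pcie=noaer’ is reported as missing.

```python
from pathlib import Path

# The two options discussed in this thread, as (key, value) pairs.
WANTED = (("pci", "noaer"), ("pcie_aspm", "off"))


def parse_cmdline(cmdline):
    """Split a kernel command line into {key: set of comma-separated values}."""
    opts = {}
    for token in cmdline.split():
        key, _, value = token.partition("=")
        # "pci=noaer,nomsi" carries several values under one key.
        opts.setdefault(key, set()).update(value.split(",") if value else [])
    return opts


def missing_options(cmdline, wanted=WANTED):
    """Return the wanted options that are absent, formatted as key=value."""
    opts = parse_cmdline(cmdline)
    return [f"{k}={v}" for k, v in wanted if v not in opts.get(k, set())]


def read_running_cmdline():
    """Read the arguments the running kernel was booted with."""
    return Path("/proc/cmdline").read_text()
```

Calling missing_options(read_running_cmdline()) on the board should return an empty list once both options are in place.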

Hi vidyas

Thank you for your help.
I attached the outputs of lspci -vvvv before (only pci=noaer) and after adding pcie_aspm=off.
That actually changed the behaviour. Now, when the camera starts acquisition, the system no longer freezes after some frames; instead, the camera connection dies. The USB device and the hub no longer show up.
dmesg prints the following (attached the dmesg log as well):
[ 106.198522] tegra-pcie 1003000.pcie: unexpected MSI
[ 115.317484] xhci_hcd 0000:01:00.0: xHCI host not responding to stop endpoint command.
[ 115.325306] xhci_hcd 0000:01:00.0: Assuming host is dying, halting host.
[ 115.335409] xhci_hcd 0000:01:00.0: HC died; cleaning up
[ 115.337420] usb 2-2.1: usbfs: usb_submit_urb returned -22
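
For anyone reading along, lines like these can be pulled out of a long dmesg dump with a small filter. This is only a Python sketch: ‘tegra-pcie’ and ‘xhci_hcd’ come from the log above, the other two patterns are assumptions about how AER messages are tagged, and filter_dmesg is a made-up name.

```python
import re

# Substrings to look for; the last two are assumptions, adjust as needed.
PATTERNS = ("tegra-pcie", "xhci_hcd", "pcieport", "AER")

# Matches "[  106.198522] message" as printed by dmesg.
LINE_RE = re.compile(r"^\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<msg>.*)$")


def filter_dmesg(text, patterns=PATTERNS):
    """Return (timestamp, message) pairs for lines containing any pattern."""
    hits = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m and any(p in m.group("msg") for p in patterns):
            hits.append((float(m.group("ts")), m.group("msg")))
    return hits
```

With a saved log, filter_dmesg(open("dmesg.log").read()) gives the matching lines in order, which makes the gap between the first ‘unexpected MSI’ and the host controller dying easy to read off.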

Trying to be smart, I added pci=nomsi to the boot config. That resulted in the old freezing behaviour. I attached the lspci -vvvv output of the last setup as well.

Best regards,
Alex

aspmOff.log (2.9 KB) dmesg.log (3.3 KB) msiOff.log (2.9 KB) noaer.log (2.9 KB)

Please capture the output of ‘lspci -vvvv’ with ‘sudo’, otherwise we will miss out on the important data. Based on the data so far, I think not having ‘pcie_aspm=off’ is enabling ASPM states which are resulting in the AER errors. So, it is better to keep ASPM disabled through the above kernel command line option.
So, do you observe this ‘unexpected MSI’ print only when all cameras start acquisition, or is this observed even with one camera starting the acquisition?
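
On the ‘sudo’ point: without root, lspci prints ‘Capabilities: <access denied>’ in place of the link registers, so the ASPM state is not visible. A rough Python sketch for checking a saved lspci dump is below. It assumes the usual ‘LnkCtl: ASPM ...;’ and ‘LnkSta: Speed ..., Width ...’ line layout of ‘lspci -vv’, and summarize_links is a hypothetical helper name.

```python
import re

# Device header, e.g. "01:00.0 USB controller: ..." (optional domain prefix).
DEV_RE = re.compile(r"^((?:[0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s+(.*)$")
# "LnkCtl: ASPM Disabled; ..." or "LnkCtl: ASPM L0s L1 Enabled; ..."
ASPM_RE = re.compile(r"LnkCtl:\s+ASPM\s+([^;]+);")
# "LnkSta: Speed 5GT/s, Width x1, ..." (newer lspci may add "(ok)" after Speed)
STA_RE = re.compile(r"LnkSta:\s+Speed\s+([\d.]+GT/s)(?:\s*\(\w+\))?,\s+Width\s+(x\d+)")


def summarize_links(lspci_text):
    """Collect ASPM state and negotiated link speed/width per device."""
    devices = []
    current = None
    for line in lspci_text.splitlines():
        m = DEV_RE.match(line)
        if m:
            current = {"bdf": m.group(1), "name": m.group(2), "aspm": None,
                       "speed": None, "width": None, "denied": False}
            devices.append(current)
            continue
        if current is None:
            continue
        if "<access denied>" in line:
            # lspci was run without root; capability data is missing.
            current["denied"] = True
            continue
        m = ASPM_RE.search(line)
        if m:
            current["aspm"] = m.group(1).strip()
            continue
        m = STA_RE.search(line)
        if m:
            current["speed"], current["width"] = m.group(1), m.group(2)
    return devices
```

Any entry with "denied" set to True means the dump has to be captured again with sudo before the ASPM fields can be trusted.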

Could you please share the full log as well? Also the output of ‘cat /proc/interrupts’, if possible.
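
If it helps, two snapshots of /proc/interrupts taken a few seconds apart can be compared to see whether the controller's interrupt count is still advancing. The Python sketch below sums the per-CPU counters for rows whose description contains a given string. The helper names are hypothetical, and ‘xhci_hcd’ as the search string is an assumption about how the row is labelled on this board.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts into (label, per-CPU counts, description) rows."""
    lines = text.splitlines()
    if not lines:
        return []
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    rows = []
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        counts = []
        # Leading numeric fields are the per-CPU counters.
        while fields and len(counts) < ncpu and fields[0].isdigit():
            counts.append(int(fields.pop(0)))
        rows.append((label.strip(), counts, " ".join(fields)))
    return rows


def irq_totals(text, needle):
    """Sum counters across CPUs for rows whose description contains needle."""
    return {label: sum(counts)
            for label, counts, desc in parse_interrupts(text)
            if needle in desc}
```

Running irq_totals(open("/proc/interrupts").read(), "xhci_hcd") before and after starting acquisition shows whether interrupts are still being delivered when the camera drops out.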

Also, please try the patch below and update the thread with the full dmesg log.

diff --git a/drivers/pci/host/pci-tegra.c b/drivers/pci/host/pci-tegra.c
index 63c0e343d388..4dc640e795ea 100644
--- a/drivers/pci/host/pci-tegra.c
+++ b/drivers/pci/host/pci-tegra.c
@@ -2956,7 +2956,7 @@ static irqreturn_t tegra_pcie_msi_irq(int irq, void *data)
                                 * that's weird who triggered this?
                                 * just clear it
                                 */
-                               dev_info(pcie->dev, "unexpected MSI\n");
+                               dev_info(pcie->dev, "unexpected MSI, index = %u, irq = %d\n", index, irq);
                        }

                        /* see if there's any more pending in this vector */
@@ -2983,10 +2983,11 @@ static int tegra_msi_setup_irq(struct msi_controller *chip, struct pci_dev *pdev
        if (hwirq < 0)
                return hwirq;

+       pr_info("---> (%d : %s) hwirq = %d\n",__LINE__,__func__, hwirq);
        irq = irq_create_mapping(msi->domain, hwirq);
        if (!irq)
                return -EINVAL;
-
+       pr_info("---> (%d : %s) hwirq = %d, irq = %u\n",__LINE__,__func__, hwirq, irq);
        irq_set_msi_desc(irq, desc);

        msg.address_lo = lower_32_bits(msi->phys);
@@ -3008,6 +3009,7 @@ static void tegra_msi_teardown_irq(struct msi_controller *chip, unsigned int irq
        struct irq_data *d = irq_get_irq_data(irq);

        PR_FUNC_LINE;
+       pr_info("---> (%d : %s) hwirq = %lu, irq = %u\n",__LINE__,__func__, d->hwirq, irq);
        tegra_msi_free(msi, d->hwirq);
 }

I didn’t install the patch, as I have not yet compiled a kernel myself and have to figure that out first.

The error is observed with only one camera connected. The camera, and seemingly the whole PCIe USB card disconnects. However I can see that the camera is still provided power. If I detach and reattach the camera, there is no power provided any more and none of the devices reconnects to the jetson.

I attached the requested outputs from dmesg, lspci and interrupts.

Thank you very much!

dmesg2.log (62.6 KB) interrupts.log (8.6 KB) lspciSudo.log (14.0 KB)

There has been no update from you for a while, so we assume this is no longer an issue.
Hence we are closing this topic. If you need further support, please open a new one.
Thanks

Thanks for the logs. It looks like the device is misbehaving, but I need the logs after applying the patch. I’ll wait till that is available.