JetPack 5.0.2 with PCIE Hub and NVME doesn't work

Hello everyone,

After upgrading from JetPack 4.7.3 to JetPack 5.0.2, the NVMe drive in my custom board doesn’t work anymore. I get the following output in dmesg:

[  475.963749] nvme nvme0: pci function 0005:06:00.0
[  537.573600] nvme nvme0: I/O 0 QID 0 timeout, disable controller
[  537.681361] nvme nvme0: Device shutdown incomplete; abort shutdown
[  537.701906] nvme nvme0: Identify Controller failed (-4)
[  537.702134] nvme nvme0: Removing after probe failure status: -5

I’ve tried the following combinations:

  • Same SOM+Kernel with a custom board without a PCIE Hub, so that the NVME is directly connected. → works
  • Same custom Board with PCIE Hub and with JetPack 4.7.3 → works

So I think it’s a timing issue during probing in the nvme-pci driver.
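The timestamps in the dmesg excerpt above fit that picture: the gap between the "pci function" line and the first timeout is about 61.6 s, which looks like the driver waiting out its admin command timeout (60 s by default, as far as I know) rather than failing immediately. A minimal Python sketch of that arithmetic; the helper names are my own and only the two log lines are taken from the output above:

```python
import re

# Two lines copied from the dmesg excerpt above
LOG = """\
[  475.963749] nvme nvme0: pci function 0005:06:00.0
[  537.573600] nvme nvme0: I/O 0 QID 0 timeout, disable controller
"""

# Matches the leading "[  seconds.micros]" timestamp of a dmesg line
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def first_timestamp(log: str, needle: str) -> float:
    """Return the timestamp of the first log line containing `needle`."""
    for line in log.splitlines():
        match = TIMESTAMP.match(line)
        if match and needle in line:
            return float(match.group(1))
    raise ValueError(f"no line containing {needle!r}")


def probe_to_timeout_seconds(log: str) -> float:
    """Seconds between the start of the probe and the first timeout message."""
    return first_timestamp(log, "timeout") - first_timestamp(log, "pci function")


if __name__ == "__main__":
    print(f"{probe_to_timeout_seconds(LOG):.1f} s")
```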

The reason we’re switching from 4.7.3 to 5.0.2 is CBoot, which also had a problem similar to this one: Boot Jetson xavier NX using M.2 Key-M SSD

What is also interesting: in the UEFI I see the correct NVMe drive as a boot option.

Attached is the dmesg output & the PCIE hub logic in the actual board.


bootlog_pci_hub_nvme_failed_jetpack_5_0_2.log (80.9 KB)

Thanks a lot and best regards,
Saber

Is lspci able to detect the device?

lspci detects the device, but the nvme driver does not. If I manually bind the nvme driver to the device, I get output similar to the above (60 s timeout).
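For anyone who wants to reproduce the manual bind: one common way is to write the PCI address to the driver's bind file in sysfs. A minimal Python sketch, assuming the standard /sys/bus/pci/drivers/nvme/bind path and the address 0005:06:00.0 from the dmesg line above; the function name and the sysfs_root parameter are hypothetical and only there to make the helper easy to try against a scratch directory:

```python
from pathlib import Path


def bind_pci_driver(address: str, driver: str = "nvme", sysfs_root: str = "/sys") -> Path:
    """Ask `driver` to claim the PCI device at `address` via its sysfs bind file.

    Writing to the real /sys tree requires root, and the write fails if the
    device is already bound or the driver rejects it.
    """
    bind_file = Path(sysfs_root) / "bus" / "pci" / "drivers" / driver / "bind"
    bind_file.write_text(address)
    return bind_file
```

Run as root, `bind_pci_driver("0005:06:00.0")` triggers a fresh probe of the device, so the result shows up in dmesg again.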

@WayneWWW any ideas so far? I am still stuck on this problem.

Hi,

Do you have other NVMe drives you can test with here?

Hi,

Yes, we’ve tried 3 different NVMe drives from different vendors with different sizes.

  • Kingston 2000G
  • Transcend TS256GMTE110S
  • SiliconPower A60

Does every one of them get the nvme driver timeout error?

Try to disable ASPM and see if it could work.

Yes, I get the same error every time in all 3 cases. I’ve also already tried disabling ASPM by appending the option to the kernel command line.
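To rule out the option simply not reaching the kernel, the running command line can be checked in /proc/cmdline. A minimal Python sketch; the helper names are my own, and the example command line in the comment is made up for illustration:

```python
def aspm_disabled(cmdline: str) -> bool:
    """Return True if the kernel command line contains the pcie_aspm=off token."""
    # Split on whitespace so only an exact token matches, not a substring
    return "pcie_aspm=off" in cmdline.split()


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read()


# Hypothetical example:
#   aspm_disabled("root=/dev/mmcblk0p1 rw pcie_aspm=off")
```

On the target, `aspm_disabled(read_cmdline())` should return True if the APPEND line in extlinux.conf was picked up.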

Please connect any one of the devices that reproduces the issue and share the lspci -vvv result.

Hi Wayne, attached is the detailed lspci output for all the NVMe drives. Note: pcie_aspm=off is active in extlinux.

lspci_output_siliconMotion.txt (3.6 KB)
lspci_output_kingston.txt (3.6 KB)
lspci_output_transcend.txt (3.5 KB)

And as a reference, here is the JetPack 4 output on the same board:

lspci_output_siliconMotion_Jetpack_4.txt (3.2 KB)

@WayneWWW there is some interesting news on this topic. The UEFI is able to load the kernel directly from the NVMe drive (so extlinux.conf is read), but the boot hangs when it tries to mount the rootfs from the NVMe drive. So I think that the nvme or PCI drivers in JP5 aren’t working with the PCIe hub. Attached is the overview picture.

Also attached the output with UEFI SSD Boot:
ssd_boot_pcie_hub.txt (74.1 KB)

@WayneWWW do you have any ideas about this topic? Unfortunately, I’ve been stuck on this issue for weeks.

Thanks a lot and best regards,
Saber

We are also struggling with the same issue after a JetPack upgrade from 4.6.2 → 5.0.2.
I get these errors in kern.log, and I have already tried pcie_aspm=off:
[ 66.528788] nvme nvme0: I/O 4 QID 0 timeout, disable controller
[ 66.531231] nvme nvme0: Identify Controller failed (-4)
[ 66.533274] nvme nvme0: Removing after probe failure status: -5

Please move to use the latest JetPack 5.1.1 release.

Hi @kayccc, I’ve tested the latest release and get the same error:

[   66.529819] nvme nvme0: I/O 24 QID 0 timeout, disable controller
[   66.637745] nvme nvme0: Device shutdown incomplete; abort shutdown
[   66.639376] nvme nvme0: Identify Controller failed (-4)
[   66.640634] nvme nvme0: Removing after probe failure status: -5