Jetson AGX Orin 32GB: failure using PCIe C1

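We have an NVMe drive attached to PCIe C1 on our custom-designed carrier board. UEFI detects the attached NVMe drive, but the drive cannot be found once the OS has booted. We tried moving the module flashed with our system image onto your developer kit; there, pcie14100000 shows link up and Wi-Fi is present under the OS. Our hardware team has confirmed that the PCIe C1 routing matches the routing on your kit. Is there any special configuration required to use NVMe on PCIe C1?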

This is the NVMe drive as detected under UEFI

Under the OS, the error "14100000.pcie: Phy link never came up" is not reported

*** Please note that this reply is generated by LLM automatically ***

The error message you’re seeing suggests that there’s an issue with the PCIe link on your Jetson AGX Orin module. Specifically, it’s complaining about the “Phy link never came up” on the PCIe interface.

Here are a few potential causes and solutions to this issue:

  1. Hardware issue: It’s possible that there’s a hardware problem with the PCIe interface on your Jetson module or the device you’re trying to connect to it. Try swapping out the device or the cable to see if the issue persists.

  2. Firmware issue: The drive is reported as detected under UEFI, so it may be worth ruling out the UEFI firmware on your Jetson module. Try updating the firmware to the latest version to see if it resolves the issue.

  3. Software issue: It’s possible that there’s a software issue that’s preventing the PCIe link from coming up. Try rebooting your Jetson module or reloading the PCIe driver to see if it resolves the issue.

  4. Configuration issue: It’s possible that there’s a configuration issue that’s preventing the PCIe link from coming up. Try checking the PCIe configuration on your Jetson module to make sure it’s set up correctly.

In terms of comparing this issue to other potential issues, here are a few things to keep in mind:

  • PCIe link speed: If the PCIe link is not coming up, it’s possible that the link speed is not set correctly. Try checking the PCIe link speed on your Jetson module to make sure it’s set to a valid value.
  • PCIe lane configuration: If the PCIe link is not coming up, it’s possible that the PCIe lane configuration is not set correctly. Try checking the PCIe lane configuration on your Jetson module to make sure it’s set to a valid value.
  • Device tree configuration: If the PCIe link is not coming up, it’s possible that the device tree configuration is not set correctly. Try checking the device tree configuration on your Jetson module to make sure it’s set to a valid value.

Here are some specific commands you can try to troubleshoot the issue:

  • lspci -v to check the PCIe configuration on your Jetson module
  • dmesg | grep -i pcie to check the kernel logs for any PCIe-related errors (the -i flag matters because the driver messages use lowercase "pcie")
  • sudo /opt/nvidia/jetson-io/jetson-io.py to review the device tree overlay configuration on your Jetson module
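
When reading lspci -vvv output, the two lines that matter most for a link problem are LnkCap (what the port can do) and LnkSta (what it actually negotiated). A minimal Python sketch for comparing them is shown here. It assumes the usual pciutils text layout, where each device starts on an unindented line and the link lines are indented beneath it; the function names parse_links and downgraded are illustrative, not part of any NVIDIA tool.

```python
import re

# An unindented line such as "0001:01:00.0 Non-Volatile memory controller: ..."
DEVICE_RE = re.compile(r"^([0-9a-fA-F:.]+) (.+)$")
# Matches "LnkCap:" or "LnkSta:" (not LnkCap2/LnkSta2) and pulls speed and width.
LNK_RE = re.compile(r"(LnkCap|LnkSta):.*?Speed ([\d.]+)GT/s.*?Width x(\d+)")


def parse_links(lspci_text):
    """Return {device_address: {"LnkCap": (GT/s, width), "LnkSta": (GT/s, width)}}."""
    devices = {}
    current = None
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            match = DEVICE_RE.match(line)
            current = match.group(1) if match else None
            if current is not None:
                devices[current] = {}
            continue
        match = LNK_RE.search(line)
        if match and current is not None:
            devices[current][match.group(1)] = (float(match.group(2)), int(match.group(3)))
    return devices


def downgraded(devices):
    """List devices whose negotiated speed or width is below their capability."""
    result = []
    for address, links in devices.items():
        cap, sta = links.get("LnkCap"), links.get("LnkSta")
        if cap and sta and (sta[0] < cap[0] or sta[1] < cap[1]):
            result.append(address)
    return result
```

To use it, save the output of sudo lspci -vvv to a file, read the file into a string, and pass that string to parse_links. A device that is missing from the result entirely did not enumerate at all, which is a different failure from a link that trained at a lower speed or width.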

I hope this helps! Let me know if you have any other questions.


There is nothing to configure. The default software already configures everything you need.

If the custom board is not able to detect the drive, please check the hardware signals and design.

Hello, the online guide shows that C1 is used with M.2 Key E, whereas PCIe C1 on our board uses M.2 Key M. Does this make a difference?

No, it does not affect software configuration.

Please follow the debug tips here

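What puzzles us now is why the NVMe drive can be detected under UEFI.

After we lowered the C1 speed to Gen 2, lspci finds both NVMe controllers. However, after formatting the NVMe drives, we now see several kinds of unstable behavior:

1. lspci finds both NVMe controllers, and both drives are found

2. lspci finds both NVMe controllers, but only one drive is found (it can be either the C1 or the C4 drive)

3. lspci finds only the C4 controller

4. On reboot, UEFI reports an assertion failure: assert [nvmexpressdxe] /out/nvidia/bootloader/uefi/jetson_release/edk2/mdemodulepkg/bus/pci/nvmexpressdxe/nvmexpresshci.c(772): (private->cap.mpsmin + 12) <= 12

Could you please help us with these?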
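
The assert in case 4 checks a field of the NVMe Controller Capabilities register (CAP). In the NVMe specification, MPSMIN occupies bits 51:48 of the 64-bit CAP register, and the minimum memory page size is 2^(12 + MPSMIN) bytes, so the check only passes when the controller reports a 4 KiB minimum page size. The Python sketch below decodes that field. One possible reading, offered as an assumption and not a confirmed diagnosis, is that the register read returned all ones because the link was not stable at that moment, which would make MPSMIN equal to 15 and trip the assert. The function names are illustrative.

```python
EFI_PAGE_SHIFT = 12  # 4 KiB pages; corresponds to the "<= 12" in the assert text


def mpsmin(cap):
    """Extract CAP.MPSMIN (bits 51:48) from a 64-bit CAP register value."""
    return (cap >> 48) & 0xF


def min_page_size(cap):
    """Minimum memory page size in bytes: 2 ** (12 + MPSMIN)."""
    return 1 << (12 + mpsmin(cap))


def passes_uefi_check(cap):
    """Mirror the asserted condition: (MPSMIN + 12) <= 12."""
    return mpsmin(cap) + 12 <= EFI_PAGE_SHIFT


# Assumption for illustration: a read from a device that is not responding
# is modeled here as all ones.
ALL_ONES = 0xFFFFFFFFFFFFFFFF
```

With this model, passes_uefi_check(ALL_ONES) is False while a CAP value with MPSMIN equal to 0 passes, which would be consistent with the assert appearing only on boots where the C1 link is unreliable.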

Case 2 does not sound related to Tegra PCIe itself. Do you have a dmesg log you can provide?

dmesg_log51104_144918.txt (94.9 KB)

Could you share the output of sudo lspci -vvv in this situation, together with the output from a boot where the drive is detected?

log.txt (109.6 KB) This is the log for case 2

1.txt (117.8 KB) This is the log for case 1

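It looks like link detection still has some problems when the issue occurs. Do you have a PCIe analyzer that could capture a trace?
Do you have another SSD to test with?

We swapped in drives that previously tested stable on the AGX Orin module. The drive on C4 is now found every time, but the drive on C1 is still intermittently missing, which largely matches the situations I described above.

Have you come across similar cases before? To confirm once more: does installing an NVMe drive on C1 require any changes to the PCIe driver?

It is not convenient for us to measure signal quality at the moment. Are there any other troubleshooting ideas? We would appreciate a reply. Thank you very much.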
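
Since the failure is intermittent, it may help to record which outcome each boot produces before changing anything else. The following Python sketch sorts one boot into the first three situations listed earlier in the thread, using the lspci output and the block device names (for example the entries of /sys/block). It is only a sketch under stated assumptions: it counts controllers and drives and does not check whether the surviving controller is on C1 or C4, and the names classify_boot and tally are hypothetical.

```python
import re
from collections import Counter

NVME_CTRL = "Non-Volatile memory controller"  # class name printed by lspci
NVME_DISK_RE = re.compile(r"^nvme\d+n\d+$")   # namespaces such as nvme0n1


def classify_boot(lspci_text, block_names):
    """Map one boot to 'case 1', 'case 2', 'case 3' or 'other'."""
    controllers = sum(
        1
        for line in lspci_text.splitlines()
        if line and not line[0].isspace() and NVME_CTRL in line
    )
    disks = sum(1 for name in block_names if NVME_DISK_RE.match(name))
    if controllers == 2 and disks == 2:
        return "case 1"
    if controllers == 2 and disks == 1:
        return "case 2"
    if controllers == 1:
        return "case 3"
    return "other"


def tally(boots):
    """Count outcomes over an iterable of (lspci_text, block_names) pairs."""
    return Counter(classify_boot(text, names) for text, names in boots)
```

Collecting the lspci text and the /sys/block listing once per boot over a few dozen reboots gives a failure rate per case, which makes it easier to tell whether a change such as a different drive or a lower link speed actually moved the numbers.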