PCIE bus error (console information log)

frank_weng · May 2, 2025, 9:10am

The console shows the following PCIe bus error log.
What is the impact on the PCIe devices?

[ 102.779116] pcieport 0001:00:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
[ 102.779118] pcieport 0001:00:00.0: device [10de:229e] error status/mask=00001000/0000e000
[ 102.779121] pcieport 0001:00:00.0: [12] Timeout
[ 102.779641] pcieport 0001:00:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
[ 102.779643] pcieport 0001:00:00.0: device [10de:229e] error status/mask=00001000/0000e000
[ 102.779645] pcieport 0001:00:00.0: [12] Timeout
[ 102.780284] pcieport 0001:00:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
[ 102.780285] pcieport 0001:00:00.0: device [10de:229e] error status/mask=00001000/0000e000
[ 102.780288] pcieport 0001:00:00.0: [12] Timeout
[ 102.786086] pcieport 0001:00:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
[ 102.786088] pcieport 0001:00:00.0: device [10de:229e] error status/mask=00001000/0000e000
[ 102.786091] pcieport 0001:00:00.0: [12] Timeout
[ 102.934002] pcieport 0001:00:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
[ 102.934005] pcieport 0001:00:00.0: device [10de:229e] error status/mask=00001000/0000e000
[ 102.934008] pcieport 0001:00:00.0: [12] Timeout
[ 145.776864] pcieport 0001:00:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
[ 145.776868] pcieport 0001:00:00.0: device [10de:229e] error status/mask=00001000/0000e000
[ 145.776872] pcieport 0001:00:00.0: [12] Timeout
[ 145.786781] pcieport 0001:00:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
[ 145.786784] pcieport 0001:00:00.0: device [10de:229e] error status/mask=00001000/0000e000
[ 145.786787] pcieport 0001:00:00.0: [12] Timeout
[ 145.874489] pcieport 0001:00:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
[ 145.874492] pcieport 0001:00:00.0: device [10de:229e] error status/mask=00001000/0000e000
[ 145.874495] pcieport 0001:00:00.0: [12] Timeout
[ 209.767260] pcieport 0001:00:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
[ 209.767263] pcieport 0001:00:00.0: device [10de:229e] error status/mask=00001000/0000e000
[ 209.767267] pcieport 0001:00:00.0: [12] Timeout
[ 209.816840] pcieport 0001:00:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
[ 209.816843] pcieport 0001:00:00.0: device [10de:229e] error status/mask=00001000/0000e000
[ 209.816846] pcieport 0001:00:00.0: [12] Timeout
[ 304.786046] pcieport 0001:00:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
[ 304.786051] pcieport 0001:00:00.0: device [10de:229e] error status/mask=00001000/0000e000

linuxdev · May 2, 2025, 8:14pm

In theory the problem is corrected, but it isn’t really possible to know based on what is presented.

For background, many PCIe devices support the optional “Advanced Error Reporting” (AER) capability, which lets the kernel log various error types as they are detected. An error logged with severity=Corrected has already been recovered by the hardware, for example by retransmitting a packet. The cause depends on the specific error, and although often the problem is one of signal quality, it also is not unusual for this to be related to a software issue, e.g., a mismatched driver or argument passed to the driver.
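As a rough illustration of how to read the "error status/mask=00001000/0000e000" line in the log above, here is a minimal Python sketch that decodes both values against the bit layout of the AER Correctable Error Status register. The function names are my own, and the short labels are meant to follow what the kernel prints in dmesg; treat it as a reading aid rather than a diagnostic tool.

```python
# Bit positions of the AER Correctable Error Status register (PCIe spec).
# The short labels are intended to match what the Linux kernel prints.
CORRECTABLE_BITS = {
    0: "RxErr (Receiver Error)",
    6: "BadTLP",
    7: "BadDLLP",
    8: "Rollover (REPLAY_NUM Rollover)",
    12: "Timeout (Replay Timer Timeout)",
    13: "NonFatalErr (Advisory Non-Fatal Error)",
    14: "CorrIntErr (Corrected Internal Error)",
    15: "HeaderOF (Header Log Overflow)",
}


def decode_correctable(value):
    """Return a label for every bit set in a correctable status or mask value."""
    names = []
    for bit in range(32):
        if value & (1 << bit):
            names.append(f"[{bit}] " + CORRECTABLE_BITS.get(bit, "reserved/unknown"))
    return names


def parse_status_mask(line):
    """Extract (status, mask) as integers from a 'status/mask=X/Y' log line."""
    field = line.split("status/mask=")[1].split()[0]
    status_hex, mask_hex = field.split("/")
    return int(status_hex, 16), int(mask_hex, 16)


if __name__ == "__main__":
    line = "pcieport 0001:00:00.0: device [10de:229e] error status/mask=00001000/0000e000"
    status, mask = parse_status_mask(line)
    print("reported:", decode_correctable(status))
    print("masked:  ", decode_correctable(mask))
```

For the values in this thread, the status has only bit 12 set, which matches the "[12] Timeout" line, and the mask covers bits 13 to 15, meaning those error types are not reported on this port.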

Note that the particular PCIe device itself defines much of this. Yours is apparently at slot 0001:00:00.0. Normally one would use lspci to find out more information. Some information on this:

- lspci on its own gives a brief view of all known PCIe devices. Jetsons don’t have a lot, but it might list PCIe bridges for example in addition to the device itself.
- One has to use sudo to get the most verbose format of lspci.
- To view only the slot your error message is about, and to simultaneously create a log file you can attach to the forum:
  sudo lspci -s '0001:00:00.0' -vvv 2>&1 | tee log_lspci.txt

With that you could see verbose information about the specific device, and then attach a copy to the forum. More information would probably be available then.

We would also need to know the exact model of Jetson. This includes whether there is a custom or third party carrier board involved, or if this is purely a developer’s kit. I suggest adding this information:

- cat /etc/nv_boot_control.conf
- head -n 1 /etc/nv_tegra_release
- Have there been any device tree modifications, and if so what?
- If this is a PCIe device you installed, add details about what the device is; if not, then specify that you don’t have any optional PCIe hardware (including the M.2 slot).

frank_weng · May 6, 2025, 2:39am

Dear Linuxdev

1: Jetson Orin Nano
2: The company developed its own board based on the Jetson Orin Nano line

3: nv_boot_control.conf
TNSPEC 3767-300-0003-P.1-1-1-jetson-orin-nano-devkit-
COMPATIBLE_SPEC 3767--0003--1--jetson-orin-nano-devkit-
TEGRA_BOOT_STORAGE nvme0n1
TEGRA_CHIPID 0x23
TEGRA_OTA_BOOT_DEVICE /dev/mtdblock0
TEGRA_OTA_GPT_DEVICE /dev/mtdblock0

4: head -n 1 /etc/nv_tegra_release
# R36 (release), REVISION: 4.3, GCID: 38968081, BOARD: generic, EABI: aarch64, DATE: Wed Jan 8 01:49:37 UTC 2025

5: The PCIe settings are unchanged. Only io_expansion was added, to control peripheral power.

6: lspci output
0001:00:00.0 PCI bridge: NVIDIA Corporation Device 229e (rev a1)
0001:01:00.0 Network controller: Realtek Semiconductor Co., Ltd. Device c852 (rev 01)
0004:00:00.0 PCI bridge: NVIDIA Corporation Device 229c (rev a1)
0004:01:00.0 Non-Volatile memory controller: ADATA Technology Co., Ltd. Device 2269 (rev 03)
0008:00:00.0 PCI bridge: NVIDIA Corporation Device 229c (rev a1)
0008:01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)

sudo lspci -s '0001:00:00.0' -vvv 2>&1 | tee log_lspci.txt
log_lspci_s.txt (5.1 KB)

The error occurs at random and is not happening right now. We have to wait for it to happen again before capturing the relevant logs.

linuxdev · May 6, 2025, 7:05pm

The information you’ve added is good to have:

- You’ve booted from an external device (nvme0n1) using a mainline kernel (L4T R36.x uses mainline).
- The carrier board is flashed as a dev kit.
  - Can you verify that the hardware itself is in fact truly a developer’s kit (probably it is, but this needs to be asked)? Sometimes software meant for one carrier board ends up on a different carrier board, which can work, but during debugging it is important to know that the carrier board is in fact the one the software was designed for.
- The device at slot ‘0001:00:00.0’ is one of NVIDIA’s own devices. It is a PCIe bridge. This means that the bridge and the device attached to it need to be considered together. For that it would be useful to have a tree view of lspci:
  - lspci -tv
  - With logging to a file you can attach to the forum:
    lspci -tv 2>&1 | tee log_pci_tree.txt

You can provide the tree view of lspci now, but we will need the verbose lspci of that specific slot after some errors have occurred. Assuming the dmesg logs show the same PCIe slot (the bridge) of ‘0001:00:00.0’, then whenever you see the next error:
sudo lspci -s '0001:00:00.0' -vvv 2>&1 | tee log_pcie_error.txt
(then attach log_pcie_error.txt)
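Since the errors show up at random, it may also help to summarize a saved dmesg log to see how often each slot reports them and when. The following is only a sketch: the regular expression assumes the message format shown in the original post, and the names summarize_aer and SAMPLE are made up for this example.

```python
import re
from collections import defaultdict

# Matches lines such as:
# [ 102.779116] pcieport 0001:00:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
AER_LINE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\S+\s+"
    r"(?P<dev>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]):\s+"
    r"PCIe Bus Error: severity=(?P<sev>[^,]+), type=(?P<type>[^,]+)"
)


def summarize_aer(log_text):
    """Map (slot, severity, type) to the list of timestamps at which it was logged."""
    events = defaultdict(list)
    for line in log_text.splitlines():
        m = AER_LINE.search(line)
        if m:
            key = (m["dev"], m["sev"].strip(), m["type"].strip())
            events[key].append(float(m["ts"]))
    return dict(events)


# Two entries copied from the log in the original post.
SAMPLE = """\
[ 102.779116] pcieport 0001:00:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
[ 102.779118] pcieport 0001:00:00.0: device [10de:229e] error status/mask=00001000/0000e000
[ 102.779121] pcieport 0001:00:00.0: [12] Timeout
[ 102.779641] pcieport 0001:00:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
"""

if __name__ == "__main__":
    for (dev, sev, kind), stamps in summarize_aer(SAMPLE).items():
        print(f"{dev} {sev} / {kind}: {len(stamps)} events, "
              f"first at {stamps[0]:.3f}s, last at {stamps[-1]:.3f}s")
```

To use it on a real log, replace SAMPLE with the contents of a file saved from dmesg. A burst of events around the same timestamps, as in the original post, would show up as a high count within a short first-to-last interval.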

If you post the tree view of lspci now, then we can figure out what slots the bridge might be serving. When the error occurs on the bridge it is possible that we might be interested in knowing what device that bridge serves and getting a verbose lspci on the device being served even if that device is not itself showing an error. PCIe devices do often have sub-devices though, and so the tree view slot naming might need an explanation when describing what the slot is that the bridge serves. We can get that knowledge out ahead of time and then see if there are downstream errors as well as bridge errors.
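Until the tree view is available, the plain lspci listing already posted gives a hint. The sketch below groups that listing by PCI domain. It assumes, as the listing suggests but does not prove, that each PCIe controller appears as its own domain with the bridge on bus 00 and anything on another bus sitting behind it. The function name group_by_domain is hypothetical, and lspci -tv remains the authoritative view.

```python
import re
from collections import defaultdict

# Matches "dddd:bb:dd.f description" as printed by lspci when domains are shown.
LSPCI_LINE = re.compile(
    r"^(?P<domain>[0-9a-f]{4}):(?P<bus>[0-9a-f]{2}):(?P<dev>[0-9a-f]{2})\.(?P<fn>[0-7])\s+(?P<desc>.+)$"
)


def group_by_domain(lspci_text):
    """Split lspci lines per domain into bus-00 bridges and devices on other buses."""
    domains = defaultdict(lambda: {"bridges": [], "devices": []})
    for line in lspci_text.splitlines():
        m = LSPCI_LINE.match(line.strip())
        if not m:
            continue
        slot = f"{m['domain']}:{m['bus']}:{m['dev']}.{m['fn']}"
        # Assumption: bus 00 holds the bridge, other buses hold downstream devices.
        kind = "bridges" if m["bus"] == "00" else "devices"
        domains[m["domain"]][kind].append((slot, m["desc"]))
    return dict(domains)


# Listing copied from the reply above.
LISTING = """\
0001:00:00.0 PCI bridge: NVIDIA Corporation Device 229e (rev a1)
0001:01:00.0 Network controller: Realtek Semiconductor Co., Ltd. Device c852 (rev 01)
0004:00:00.0 PCI bridge: NVIDIA Corporation Device 229c (rev a1)
0004:01:00.0 Non-Volatile memory controller: ADATA Technology Co., Ltd. Device 2269 (rev 03)
0008:00:00.0 PCI bridge: NVIDIA Corporation Device 229c (rev a1)
0008:01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)
"""

if __name__ == "__main__":
    for domain, group in group_by_domain(LISTING).items():
        for slot, desc in group["bridges"]:
            print(f"{slot} ({desc}) probably serves:")
        for slot, desc in group["devices"]:
            print(f"    {slot} {desc}")
```

Under that assumption, the device behind the bridge at 0001:00:00.0 would be the Realtek network controller at 0001:01:00.0, so that is likely the slot to capture with a verbose lspci alongside the bridge.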

If it turns out that the device being served by the bridge is the NVMe, then we might ask more questions about the NVMe, but don’t bother for now.
