Jetson AGX Thor Automatic System Reboot Post Boot-Up

Dear NVIDIA Technical Support,

We are experiencing an issue with the Tegra module where the system automatically reboots.

The system version is L4T 38.2. After the device finishes booting, it restarts on its own without any user action.

We kindly request your technical assistance. The relevant logs have been attached to this email.

We look forward to your support and reply.

[com COM7] (2026-01-05_104220) COM7 (Silicon Labs CP210x USB to UART Bridge (COM7)).log (330.2 KB)

Hi fz11,

Are you using the devkit or a custom board for Thor?

[  744.354386] rcu: INFO: rcu_sched self-detected stall on CPU
[  744.354391] rcu: 0-....: (5250 ticks this GP) idle=6f04/1/0x4000000000000000 softirq=17029/17029 fqs=1778
[  744.354395] rcu: (t=5251 jiffies g=24785 q=18426 ncpus=14)
[  807.370387] rcu: INFO: rcu_sched self-detected stall on CPU
[  807.370389] rcu: 0-....: (21003 ticks this GP) idle=6f04/1/0x4000000000000000 softirq=17029/17029 fqs=7793
[  807.370392] rcu: (t=21005 jiffies g=24785 q=26302 ncpus=14)

I saw the errors above in the log you shared, and it seems you have enabled the RT kernel (i.e., 6.8.12-rt-tegra).
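For anyone reading the stall lines above, here is a rough Python sketch of how the `t=... jiffies` values can be turned into wall-clock time. It only uses the two summary lines quoted in this post; the function names are made up for illustration, and the tick rate is inferred from the log rather than read from the kernel config, so treat it as an estimate.

```python
import re

# The two "t=... jiffies" summary lines quoted from the serial log.
LOG = """\
[  744.354395] rcu: (t=5251 jiffies g=24785 q=18426 ncpus=14)
[  807.370392] rcu: (t=21005 jiffies g=24785 q=26302 ncpus=14)
"""

STALL = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+rcu:\s+\(t=(?P<t>\d+) jiffies g=(?P<g>\d+)"
)


def parse_stalls(text):
    """Return (timestamp_seconds, stalled_jiffies, grace_period) tuples."""
    return [
        (float(m["ts"]), int(m["t"]), int(m["g"]))
        for m in STALL.finditer(text)
    ]


def estimate_hz(stalls):
    """Infer the tick rate from two reports of the same grace period."""
    (ts0, t0, g0), (ts1, t1, g1) = stalls[0], stalls[-1]
    if g0 != g1:
        raise ValueError("reports belong to different grace periods")
    return (t1 - t0) / (ts1 - ts0)


def stall_seconds(stalls):
    """Approximate length of the stall at the time of the last report."""
    return stalls[-1][1] / estimate_hz(stalls)


if __name__ == "__main__":
    stalls = parse_stalls(LOG)
    print(f"estimated tick rate: {estimate_hz(stalls):.1f} Hz")
    print(f"stall length at last report: {stall_seconds(stalls):.1f} s")
```

Both reports carry the same grace-period number (`g=24785`), so this is one continuous stall on CPU 0 rather than two separate events. The inferred tick rate works out to roughly 250 Hz, which would put the stall at well over a minute by the second report.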

Could you help to clarify the following:

  1. Is the issue specific to the RT kernel?
  2. Is a display connected when the issue occurs?

Hello KevinFFF,
We are on the same team, so let me answer these questions.

We are using a custom board for Thor.

The real-time kernel has only been compiled; it has not been enabled in the startup file yet.

The problem was reproduced with a monitor connected. We watched both the serial console and the display at the same time.

Relevant system information:

(base) nvidia@nvidia:~$ uname -a
Linux nvidia 6.8.12-rt-tegra #6 SMP Fri Dec  5 15:16:12 CST 2025 aarch64 aarch64 aarch64 GNU/Linux
(base) nvidia@nvidia:~$ cat /etc/nv_tegra_release
# R38 (release), REVISION: 2.0, GCID: 41844464, BOARD: generic, EABI: aarch64, DATE: Fri Aug 22 00:55:42 UTC 2025
# KERNEL_VARIANT: oot
TARGET_USERSPACE_LIB_DIR=nvidia
TARGET_USERSPACE_LIB_DIR_PATH=usr/lib/aarch64-linux-gnu/nvidia
INSTALL_TYPE=openrm
(base) nvidia@nvidia:~$ cat /boot/extlinux/extlinux.conf
TIMEOUT 30
DEFAULT primary

MENU TITLE L4T boot options

LABEL primary
      MENU LABEL primary kernel
      LINUX /boot/plink/Image
      INITRD /boot/initrd
      FDT /boot/plink/y-c28-agx-thor-382.dtb
      APPEND ${cbootargs} root=PARTUUID=d88290f2-1de5-401d-905c-e6a7d6aebf3d rw rootwait rootfstype=ext4 mminit_loglevel=4 earlycon=tegra_utc,mmio32,0xc5a0000 console=ttyUTC0,115200 clk_ignore_unused firmware_class.path=/etc/firmware fbcon=map:0 efi=runtime

# When testing a custom kernel, it is recommended that you create a backup of
# the original kernel and add a new entry to this file so that the device can
# fallback to the original kernel. To do this:
#
# 1, Make a backup of the original kernel
#      sudo cp /boot/Image /boot/Image.backup
#
# 2, Copy your custom kernel into /boot/Image
#
# 3, Uncomment below menu setting lines for the original kernel
#
# 4, Reboot

# LABEL backup
#    MENU LABEL backup kernel
#    LINUX /boot/Image.backup
#    INITRD /boot/initrd
#    APPEND ${cbootargs}

(base) nvidia@nvidia:~$
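To cross-check which kernel image the default boot entry actually points to, a small Python sketch like the one below can parse `extlinux.conf` and report the `LINUX`, `INITRD` and `FDT` paths of the `DEFAULT` label. The parser is a simplified illustration (it is not the bootloader's own parser and ignores `MENU` lines), and the sample text is a shortened copy of the file shown above.

```python
# Shortened copy of the extlinux.conf shown above (APPEND line omitted).
SAMPLE = """\
TIMEOUT 30
DEFAULT primary

MENU TITLE L4T boot options

LABEL primary
      MENU LABEL primary kernel
      LINUX /boot/plink/Image
      INITRD /boot/initrd
      FDT /boot/plink/y-c28-agx-thor-382.dtb

# LABEL backup
#    MENU LABEL backup kernel
#    LINUX /boot/Image.backup
"""

WANTED = ("LINUX", "INITRD", "FDT", "APPEND")


def parse_extlinux(text):
    """Return (default_label, {label: {keyword: value}})."""
    default = None
    labels = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and commented-out entries
        parts = line.split(None, 1)
        key = parts[0].upper()
        value = parts[1].strip() if len(parts) > 1 else ""
        if key == "DEFAULT":
            default = value
        elif key == "LABEL":
            current = labels.setdefault(value, {})
        elif current is not None and key in WANTED:
            current[key] = value
    return default, labels


if __name__ == "__main__":
    default, labels = parse_extlinux(SAMPLE)
    for key, value in labels[default].items():
        print(f"{default}: {key} = {value}")
```

On the device, the same function could be fed the contents of `/boot/extlinux/extlinux.conf`. Since `uname -a` above already reports `6.8.12-rt-tegra`, the image at `/boot/plink/Image` would appear to be the RT build, which is worth confirming.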

Just a note about something interesting, not a cause. There is some log output like this, from within the bootloader stages:

[itia0002lize.302d ma] I>il_e Tasarlyk: L
inoad itiaand lizeauthd fuentise
cateinit regialiistezed red vfreFWs
l
i
[00niti02.3aliz18] ed aI> Pdc
artifmontion_pop namulate: Be_mo_aonnito-fw
rs: 
[00foun02.3d 2328] 7 moI> Snitoize rs
of pinitartiialitionzed : 26fmon2144
in
[0itia002.lize339]d mc I> 
inBinaitiary@ lizedevid rece:3set
/0 b
inilocktial-983ized04 ( uphparty_eaitiorly
n si
inize: tial0x40ized000) emc, na_earme: ly

What I see is two processes both writing to the same UART (or output) at the same time. I wonder if a device tree change added something to enable logging during the boot stages while something else is already (still) there. It would be a mistake for two processes to write simultaneously to the UART (or at least highly suspicious of a configuration issue). Once boot reaches the later stages this goes away, but I could see an error in environment setup causing problems.

I don’t know if this means much, but for the sake of a complete summary, here is the kernel boot command line (which may indirectly reflect the device tree):

[    0.000000] Kernel command line: root=PARTUUID=d88290f2-1de5-401d-905c-e6a7d6aebf3d rw rootwait rootfstype=ext4 mminit_loglevel=4 earlycon=tegra_utc,mmio32,0xc5a0000 console=ttyUTC0,115200 clk_ignore_unused firmware_class.path=/etc/firmware fbcon=map:0 efi=runtime bl_prof_dataptr=6225920@0x2008010000 bl_prof_ro_ptr=65536@0x2008000000 
[    0.000000] Unknown kernel command line parameters "bl_prof_dataptr=6225920@0x2008010000 bl_prof_ro_ptr=65536@0x2008000000", will be passed to user space.

This occurs later on, after the kernel is running (this is no longer part of a boot stage):

[    2.401973] tegra264-pmc c800000.pmc: there is not valid maps for state default
[    2.404957] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[    2.410209] msm_serial[   : dr 2.4iver2216 ini6] ptialrintizedk: l
[ egac   2y co.413nsol320]e [t SuptyUTerH C0] (H)SenabCI(Fled
) dr
[  iver  2. ini4348tial99] izedprin
[ tk:    2lega.417cy b805]ootc STMonso32 Ule [SARTtegr dria_utver c0] initdisaialibledzed

[ [      2 2.4.4522216165]6] p armrint-smmk: lu-v3egac 810y co5000nsol000.e [tiommtyUTu: iC0] as 4enab8-biled
t, o[   as 4 2.48-bi3489t (f9] peaturintres k: l0x00egac0e1fy boaf)
otco
[  nsol  2.e [t4527egra85] _utcarm-0] dsmmuisab-v3 led
8105000000.iommu: allocated 65536 entries for cmdq

It looks like the serial console is again seeing two independent processes writing to the same UART at the same time. In the end:
Error reading from serial device
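One way to check the "two writers" reading of the garbled output is to test whether a garbled line is a pure interleaving of two ordinary messages. The Python sketch below does that with a standard dynamic-programming check. The garbled line is copied from the log above; the two clean messages are my own reconstruction from the surrounding fragments (an assumption, not something the log shows directly).

```python
def is_interleaving(a, b, mixed):
    """True if `mixed` merges `a` and `b` while keeping each one's order."""
    if len(a) + len(b) != len(mixed):
        return False
    # reachable[j]: mixed[:i + j] can be built from a[:i] and b[:j]
    reachable = [False] * (len(b) + 1)
    for i in range(len(a) + 1):
        for j in range(len(b) + 1):
            if i == 0 and j == 0:
                reachable[j] = True
                continue
            ch = mixed[i + j - 1]
            # reachable[j] still holds the previous row (i - 1) here
            from_a = i > 0 and reachable[j] and a[i - 1] == ch
            # reachable[j - 1] was already updated for the current row
            from_b = j > 0 and reachable[j - 1] and b[j - 1] == ch
            reachable[j] = from_a or from_b
    return reachable[len(b)]


# Garbled line copied from the serial log.
MIXED = ("[ tk:    2lega.417cy b805]ootc STMonso32 Ule [SARTtegr dria_ut"
         "ver c0] initdisaialibledzed")

# Reconstructed candidates (assumption). The second one starts mid-word
# because "prin" was already printed at the end of the previous log line.
MSG_A = "[    2.417805] STM32 USART driver initialized"
MSG_B = "tk: legacy bootconsole [tegra_utc0] disabled"

if __name__ == "__main__":
    print(is_interleaving(MSG_A, MSG_B, MIXED))
```

If the check passes, the line is consistent with two sources taking turns on the same UART, apparently in small fixed-size chunks, rather than with random corruption on the wire.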


You should examine this output:

# UARTs available to user:
ls -l /dev/tty* | grep -E ' (dialout) .+[/]tty[a-zA-Z]+[0-9]+'

Is every serial UART you are using visible there? Check again with:

# UARTs unavailable due to being serial console:
ls -l /dev/tty* | grep -E ' (tty) .+[/]tty[a-zA-Z]+[0-9]+'

In the latter, there should be one device listed (though the same UART may appear under two names, depending on the driver). Does your custom board layout (and thus device tree) correctly match the device names for those UARTs reserved for your use, versus those reserved for serial console?
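If it is easier to script than to eyeball, the same two checks can be sketched in Python: group `/dev/tty*` nodes by their owning group, with `dialout` meaning available to users and `tty` meaning held by the serial console. The function names are made up for illustration, and the device names in the usage note are hypothetical examples, not taken from this board.

```python
import glob
import grp
import os
import re

# Same shape as the shell pattern: "tty", then letters, then digits.
UART_NAME = re.compile(r"/dev/tty[a-zA-Z]+[0-9]+")


def classify(entries):
    """Split (path, group_name) pairs into user and console UARTs."""
    result = {"user": [], "console": []}
    for path, group in entries:
        if not UART_NAME.match(path):
            continue  # skips virtual terminals such as /dev/tty0
        if group == "dialout":
            result["user"].append(path)
        elif group == "tty":
            result["console"].append(path)
    return result


def scan_dev(pattern="/dev/tty*"):
    """Collect the owning group of each matching node and classify it."""
    entries = []
    for path in sorted(glob.glob(pattern)):
        try:
            group = grp.getgrgid(os.stat(path).st_gid).gr_name
        except (OSError, KeyError):
            continue  # node vanished or its gid has no group name
        entries.append((path, group))
    return classify(entries)
```

Calling `scan_dev()` on the target returns a dictionary with `user` and `console` lists. For example, a hypothetical `/dev/ttyTHS1` owned by `dialout` would land under `user`, and the console UART owned by `tty` under `console`.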

It seems you have compiled and applied the rt-kernel.

Could you clarify whether a similar issue occurs when the monitor is not connected?

For the UART device tree configuration, we are using the default configuration provided for Thor. We only enabled the device tree nodes for the ports we are using.

                /*
                 * serial10 = "/bus@0/serial@810c540000"        UART2
                 * serial5 = "/bus@0/serial@810c510000";        UART5
                 * serial8 = "/bus@0/serial@a808800000";
                 * serial4 = "/bus@0/serial@810c500000";
                 * serial0 = "/bus@0/serial@c5f0000";           UART3(DEBUG)
                 * serial9 = "/bus@0/serial@810c530000";        UART1
                 */
        serial@810c530000 {     /* UART1 -> SOC_UART9 */
            status = "okay";
        };

        serial@810c540000 {     /* UART2 -> SOC_UART10 */
            status = "okay";
        };

        serial@810c510000 {     /* UART5 -> SOC_UART5 */
            status = "okay";
        };

        serial@810c500000 {
            status = "disabled";
        };

We collected the logs because this issue occurred while the customer was using the device in the field. We therefore reproduced it with a display connected, which matches the customer’s field environment. I also saw the device restart while debugging over ssh, but I was not monitoring the serial console logs at that time.

It is also possible that the device restarted during debugging because of high temperature; this has happened before. We can arrange a test to see whether the problem occurs without the monitor connected.
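To help rule the temperature theory in or out, the thermal zones could be logged alongside the serial console during the test. A minimal Python sketch, assuming the standard Linux sysfs layout under `/sys/class/thermal` (each `thermal_zone*` directory has a `type` file and a `temp` file in millidegrees Celsius); the function names and the limit value are placeholders, not Thor-specific figures.

```python
from pathlib import Path


def read_thermal_zones(base="/sys/class/thermal"):
    """Return {zone_type: temperature_in_celsius} for readable zones."""
    readings = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            name = (zone / "type").read_text().strip()
            millideg = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            continue  # some zones may be unreadable or report no value
        readings[name] = millideg / 1000.0
    return readings


def zones_over(readings, limit_c):
    """Names of zones whose temperature is above limit_c, sorted."""
    return sorted(name for name, temp in readings.items() if temp > limit_c)


if __name__ == "__main__":
    temps = read_thermal_zones()
    for name, temp in temps.items():
        print(f"{name}: {temp:.1f} C")
    # 90.0 is an arbitrary placeholder, not a documented Thor limit.
    print("over limit:", zones_over(temps, 90.0))
```

Running this in a loop with a timestamp while the device is under test would show whether temperatures climb before a reboot.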

Hi chen.xi,

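For the original RCU stall issue, could you refer to "R38.4.0 实时内核在接显示器之后 上下电会卡住" (in English: "R38.4.0 real-time kernel hangs on power cycle after a display is connected") - #6 by DaneLLL and apply the RT kernel patch from that post to check whether it helps in your case?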