JP38.2.1 system may also stall occasionally

Hi NVIDIA team,
The system occasionally stalls on JP38.2.1. We are testing on the NVIDIA Thor DevKit.
The GUI does not respond to keyboard input, and some traces appear on the serial port.
It looks like a deadlock happened in the kernel.

Over the serial port we can see that Xorg's CPU usage may reach 100%.
The full log is attached.

Could you tell me what is wrong?

BR//

log.txt (12.3 KB)

Just to clarify: does this issue only happen with the RT kernel enabled?

It also happens with the non-RT kernel.

Please share the log from the non-RT kernel too. Thanks.

Hi Wayne,
I find that almost all of the stalls happen in the "tegra_dce" module.
Should I disable the display to work around this issue?
A typical call trace looks like this:

2025-10-13T03:59:37.153396+07:00 car kernel: rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 0-… } 5406 jiffies s: 853 root: 0x1/.
2025-10-13T03:59:37.153401+07:00 car kernel: rcu: blocking rcu_node structures (internal RCU debug):
2025-10-13T03:59:37.153403+07:00 car kernel: Sending NMI from CPU 12 to CPUs 0:
2025-10-13T03:59:37.153404+07:00 car kernel: NMI backtrace for cpu 0
2025-10-13T03:59:37.164984+07:00 car kernel: CPU: 0 PID: 1896 Comm: Xorg Tainted: G W OE 6.8.12-rt-tegra #1
2025-10-13T03:59:37.164989+07:00 car kernel: Hardware name: NVIDIA NVIDIA Jetson AGX Thor Developer Kit/Jetson, BIOS r38.2-899cdbc9-dirty 09/28/2025
2025-10-13T03:59:37.164992+07:00 car kernel: pstate: 23400009 (nzCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
2025-10-13T03:59:37.164994+07:00 car kernel: pc : format_decode+0x58/0x598
2025-10-13T03:59:37.164996+07:00 car kernel: lr : vsnprintf+0x7c/0x6e0
2025-10-13T03:59:37.164998+07:00 car kernel: sp : ffff80008ddc3180
2025-10-13T03:59:37.165000+07:00 car kernel: x29: ffff80008ddc3180 x28: ffffda9232265642 x27: 00000000fffffff0
2025-10-13T03:59:37.165002+07:00 car kernel: x26: ffff80008ddc3320 x25: 0000000000000064 x24: ffff80008ddc3264
2025-10-13T03:59:37.165003+07:00 car kernel: x23: ffff80008ddc3320 x22: ffff80008ddc32c8 x21: 0000000000000002
2025-10-13T03:59:37.165005+07:00 car kernel: x20: ffffda9232265642 x19: ffff80008ddc31b0 x18: ffffffffffffffff
2025-10-13T03:59:37.165007+07:00 car kernel: x17: 0000000000000000 x16: ffffda92ac0352f0 x15: ffff80008ddc3130
2025-10-13T03:59:37.165008+07:00 car kernel: x14: ffff80008ddc32c8 x13: ffff80008ddc326c x12: 72203a6465747075
2025-10-13T03:59:37.165010+07:00 car kernel: x11: ffff0000acc23780 x10: ea90c156a2baafd9 x9 : e7f282820261669e
2025-10-13T03:59:37.165012+07:00 car kernel: x8 : ffff80008ddc32c8 x7 : 0000000000000001 x6 : 5b20746e65696c43
2025-10-13T03:59:37.165013+07:00 car kernel: x5 : 0000000000000030 x4 : ffffda9232265645 x3 : ffffda9232265645
2025-10-13T03:59:37.165015+07:00 car kernel: x2 : 0000000000000061 x1 : ffff80008ddc31b0 x0 : 0000000000000001
2025-10-13T03:59:37.165016+07:00 car kernel: Call trace:
2025-10-13T03:59:37.165018+07:00 car kernel: format_decode+0x58/0x598
2025-10-13T03:59:37.165020+07:00 car kernel: vsnprintf+0x7c/0x6e0
2025-10-13T03:59:37.165021+07:00 car kernel: dce_os_log_msg+0x8c/0x124 [tegra_dce]
2025-10-13T03:59:37.165023+07:00 car kernel: dce_client_ipc_wait+0xc0/0x188 [tegra_dce]
2025-10-13T03:59:37.165025+07:00 car kernel: dce_ipc_send_message_sync+0x90/0x288 [tegra_dce]
2025-10-13T03:59:37.165027+07:00 car kernel: tegra_dce_client_ipc_send_recv+0x94/0x1d0 [tegra_dce]
2025-10-13T03:59:37.165028+07:00 car kernel: nv_tegra_dce_client_ipc_send_recv+0x38/0x64 [nvidia]
2025-10-13T03:59:37.165030+07:00 car kernel: dceclientSendRpc_IMPL+0x64/0xe0 [nvidia]
2025-10-13T03:59:37.165032+07:00 car kernel: _dceRpcIssueAndWait.isra.0+0x80/0x100 [nvidia]
2025-10-13T03:59:37.165033+07:00 car kernel: rpcRmApiControl_dce+0xc8/0x1b0 [nvidia]
2025-10-13T03:59:37.165035+07:00 car kernel: rmresControl_Prologue_IMPL+0xb4/0x1c0 [nvidia]
2025-10-13T03:59:37.165037+07:00 car kernel: resControl_IMPL+0xec/0x1d0 [nvidia]
2025-10-13T03:59:37.165039+07:00 car kernel: serverControl+0x3b8/0x4a0 [nvidia]
2025-10-13T03:59:37.165040+07:00 car kernel: _rmapiRmControl+0x474/0x6a0 [nvidia]
2025-10-13T03:59:37.165042+07:00 car kernel: rmapiControlWithSecInfo+0xa8/0x150 [nvidia]
2025-10-13T03:59:37.165043+07:00 car kernel: rmapiControlWithSecInfoTls+0x74/0xe0 [nvidia]
2025-10-13T03:59:37.165045+07:00 car kernel: _nv04ControlWithSecInfo.constprop.0+0x80/0xa0 [nvidia]
2025-10-13T03:59:37.165046+07:00 car kernel: Nv04ControlKernel+0x50/0x60 [nvidia]
2025-10-13T03:59:37.165047+07:00 car kernel: nvkms_call_rm+0x58/0x94 [nvidia_modeset]
2025-10-13T03:59:37.165049+07:00 car kernel: nvRmApiControl+0x50/0x70 [nvidia_modeset]
2025-10-13T03:59:37.165051+07:00 car kernel: __arm64_sys_ioctl+0xac/0xf0
2025-10-13T03:59:37.165052+07:00 car kernel: invoke_syscall+0x48/0x114
2025-10-13T03:59:37.165054+07:00 car kernel: el0_svc_common.constprop.0+0xc0/0xe0
2025-10-13T03:59:37.165055+07:00 car kernel: do_el0_svc+0x1c/0x28
2025-10-13T03:59:37.165057+07:00 car kernel: el0_svc+0x30/0xa8
2025-10-13T03:59:37.165058+07:00 car kernel: el0t_64_sync_handler+0x120/0x12c
2025-10-13T03:59:37.165060+07:00 car kernel: el0t_64_sync+0x194/0x198
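
To check the "almost all stalls happen in tegra_dce" observation against a complete log rather than by eye, a small script along these lines could tally which kernel module sits innermost in each backtrace. This is only a rough sketch: the function names are hypothetical, it assumes every frame looks like the ones in the trace above (`symbol+0xOFF/0xSIZE [module]`), and it labels frames without a module tag as "vmlinux" purely as a convention of this script.

```python
import re
from collections import Counter

# Matches a call-trace frame at the end of a log line, for example
# "dce_os_log_msg+0x8c/0x124 [tegra_dce]". The module tag is optional.
FRAME_RE = re.compile(r"(\S+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?\s*$")


def split_backtraces(log_text):
    """Yield one list of (symbol, module) frames per 'Call trace:' block."""
    frames = None  # None means we are not inside a call trace
    for line in log_text.splitlines():
        if "Call trace:" in line:
            if frames:
                yield frames
            frames = []
            continue
        if frames is None:
            continue
        match = FRAME_RE.search(line)
        if match:
            # Frames without "[module]" belong to the core kernel image.
            frames.append((match.group(1), match.group(2) or "vmlinux"))
        else:
            # Any non-frame line ends the current trace.
            if frames:
                yield frames
            frames = None
    if frames:
        yield frames


def first_module(frames):
    """Return the innermost loadable module in a trace, or 'vmlinux'."""
    for _symbol, module in frames:
        if module != "vmlinux":
            return module
    return "vmlinux"


def count_stall_modules(log_text):
    """Count, per module, how many backtraces have it as innermost module."""
    return Counter(first_module(frames) for frames in split_backtraces(log_text))
```

It could be run on a saved log with something like `print(count_stall_modules(open("log.txt").read()))`, where `log.txt` stands for whichever capture is being examined. Note that this only shows where the CPU was when the NMI backtrace was taken, not why the DCE IPC wait is not completing.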

Please follow our previous suggestion and first check whether your issue reproduces with the non-RT kernel.