AGX Xavier reboot with "NMI watchdog: BUG: soft lockup..."

My AGX Xavier development kit keeps rebooting suddenly on its own.
Please help, thanks!

It runs Jetson R32.4.4 with a custom rootfs based on Ubuntu 18.04.5.

The serial output:

...

[ 2540.440740] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [ksoftirqd/0:3]
[ 2540.441074] Kernel panic - not syncing: softlockup: hung tasks
[ 2540.441172] CPU: 0 PID: 3 Comm: ksoftirqd/0 Tainted: G             L  4.9.140-tegra #1
[ 2540.441297] Hardware name: Jetson-AGX (DT)
[ 2540.441368] Call trace:
[ 2540.441416] [<ffffff800808bdb8>] dump_backtrace+0x0/0x198
[ 2540.441507] [<ffffff800808c37c>] show_stack+0x24/0x30
[ 2540.441592] [<ffffff800845c7a0>] dump_stack+0x98/0xc0
[ 2540.441680] [<ffffff80081c1438>] panic+0x11c/0x298
[ 2540.441765] [<ffffff8008181760>] watchdog_unpark_threads+0x0/0x98
[ 2540.441869] [<ffffff80081399e0>] __hrtimer_run_queues+0xd8/0x360
[ 2540.441966] [<ffffff800813a330>] hrtimer_interrupt+0xa8/0x1e0
[ 2540.442063] [<ffffff8008bffe98>] arch_timer_handler_phys+0x38/0x58
[ 2540.442337] [<ffffff8008126f10>] handle_percpu_devid_irq+0x90/0x2b0
[ 2540.442797] [<ffffff80081214f4>] generic_handle_irq+0x34/0x50
[ 2540.443252] [<ffffff8008121bd8>] __handle_domain_irq+0x68/0xc0
[ 2540.443711] [<ffffff8008080d44>] gic_handle_irq+0x5c/0xb0
[ 2540.445595] [<ffffff8008082c28>] el1_irq+0xe8/0x194
[ 2540.450495] [<ffffff8008daa274>] __netif_receive_skb_core+0xa0c/0xad8
[ 2540.456882] [<ffffff8008dad010>] __netif_receive_skb+0x28/0x78
[ 2540.462743] [<ffffff8008dad08c>] netif_receive_skb_internal+0x2c/0xb0
[ 2540.469131] [<ffffff8008dadcb4>] napi_gro_receive+0x15c/0x188
[ 2540.474734] [<ffffff800894dd90>] eqos_napi_poll_rx+0x358/0x430
[ 2540.480770] [<ffffff8008daf2e4>] net_rx_action+0xf4/0x358
[ 2540.486024] [<ffffff8008081054>] __do_softirq+0x13c/0x3b0
[ 2540.491708] [<ffffff80080baf38>] run_ksoftirqd+0x48/0x58
[ 2540.497219] [<ffffff80080e07c8>] smpboot_thread_fn+0x160/0x248
[ 2540.502999] [<ffffff80080dbe64>] kthread+0xec/0xf0
[ 2540.507811] [<ffffff80080838a0>] ret_from_fork+0x10/0x30
[ 2540.513244] SMP: stopping secondary CPUs
[ 2540.517357] Kernel Offset: disabled
[ 2540.521018] Memory Limit: none
[ 2540.524083] trusty-log panic notifier - trusty version Built: 12:18:19 Oct 16 2020
[ 2540.539169] Rebooting in 5 seconds..
????Shutdown state requested 1
Rebooting system ...
??

The corresponding system log is in the attached file:
xavier.log (290.0 KB)

hello tnger,

May I know what modifications you made before this kernel panic started occurring?
It looks like you are running some program that interrupts the system. May I know the details needed to reproduce this?
thanks

Sorry, I only customized the rootfs; nothing else was modified. The problem happens randomly, and I cannot tell when it will occur or which program triggers it. It may be network related: when the reboot occurs, my own programs report that the network connection has dropped or timed out.

hello tnger,

Please narrow down the steps to reproduce the failure.
Could you please also open a terminal and keep $ dmesg --follow running in it to gather the details?
thanks
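
A minimal sketch of the dmesg capture suggested above, for anyone who wants the output saved as well as displayed. The helper name capture_kernel_log and the default log path are hypothetical, not part of any NVIDIA tooling. Because the board panics and reboots, the last few lines may never reach the disk, so the serial console output is still worth keeping alongside this file.

```shell
# Hypothetical helper: show kernel messages live and append them to a file.
#   $1      log file (assumed default: $HOME/dmesg-follow.log)
#   $2...   optional command to run instead of "dmesg --follow" (handy for a dry run)
capture_kernel_log() {
    log_file="${1:-$HOME/dmesg-follow.log}"
    if [ "$#" -gt 0 ]; then
        shift
    fi
    # Fall back to the command from the reply above when no override is given.
    if [ "$#" -eq 0 ]; then
        set -- dmesg --follow
    fi
    # tee prints each line to the terminal and appends it to the log file.
    "$@" | tee -a "$log_file"
}

# Usage (may need sudo, depending on the kernel's dmesg_restrict setting):
#   capture_kernel_log "$HOME/dmesg-follow.log"
```

Leave it running in a spare terminal until the next reboot, then check the end of the log file for the messages that preceded the soft lockup.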
