Xavier with JP4.2 hangs

Hello,
We have issues with our AGX Xavier becoming slow and unresponsive, and eventually hanging completely for a few tens of seconds. The issue seems to happen randomly; we have found no way to reliably reproduce it. The Xavier was freshly flashed with JetPack 4.2. We have only one kit, so we can’t compare whether this is expected behavior, but I assume it is not. There is no external hardware connected apart from a keyboard, mouse, USB-C adapter and Ethernet cable. Below are dmesg logs from a few boots when this issue occurred. Is our board broken?

https://pastebin.com/R34KiQjm
https://pastebin.com/KMCDjyqB
https://pastebin.com/RvSxPD0D
https://pastebin.com/kLMpsEFs

Somehow I cannot view the pastebin pages. Could you share the logs via Google Drive or some other way?

Thank you for your response. I have added the logs as attachments:
xavier-dmesg.txt (258 KB)
xavier-dmesg2.txt (124 KB)
xavier-dmesg3.txt (86.8 KB)
xavier-dmesg4.txt (149 KB)

Hi aleksander,

You provided four dmesg logs. Do you observe the issue in all four logs?

Yes, those are example dmesg logs from when the problem occurred.

INFO: rcu_sched detected stalls on CPUs/tasks:
INFO: rcu_preempt self-detected stall on CPU
I get these messages with a custom kernel too.
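
For anyone comparing their own logs against the two stall messages quoted above, here is a minimal sketch that counts them in a saved dmesg file. The function name `count_rcu_stalls` is hypothetical, and the pattern only covers the two message variants that appear in this thread; other kernels may word the warning differently.

```shell
#!/bin/sh
# Hypothetical helper: count RCU stall reports in a saved dmesg log.
# The pattern matches only the two variants quoted in this thread:
#   "rcu_sched detected stalls on CPUs/tasks:"
#   "rcu_preempt self-detected stall on CPU"
count_rcu_stalls() {
    # $1: path to a saved dmesg log (for example xavier-dmesg.txt)
    grep -c -E 'rcu_(sched|preempt) (self-)?detected stalls? on CPU' "$1"
}
```

Usage would be something like `count_rcu_stalls xavier-dmesg.txt`, run once per attached log to see whether every boot shows the stalls.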

I’m checking with our teams to see if anyone has relevant experience.
I’ll update here.

Hi, please try the patch below and let me know if it resolves your issue.
0001-ethernet-eqos-fix-lockup-due-to-SOFTIRQ-unsafe.patch.zip (2.35 KB)

Hi, I patched the kernel, but the following messages still appear:

INFO: rcu_sched detected stalls on CPUs/tasks:
INFO: rcu_preempt self-detected stall on CPU

Is it easy to reproduce? Do you observe it on another Xavier DevKit?

I haven’t tried another Xavier.
It happens when I use the nvarguscamerasrc element in GStreamer,
about 10 to 15 minutes after launching the pipeline.

The following fixes the problem:

sudo jetson_clocks

Without jetson_clocks, the usage of CPU core 0 becomes extremely high.
I used the perf command and found that cpuidle has something to do with this.
I had thought that jetson_clocks only sets clock frequencies and enables CPU cores,
but it disables cpuidle too.
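
To check the cpuidle observation on your own board, a minimal sketch follows that lists each idle state of one CPU together with its `disable` flag. It assumes the kernel exposes the standard Linux cpuidle sysfs files (`name` and `disable` under `/sys/devices/system/cpu/cpuN/cpuidle/stateM`). The function name `show_cpuidle_states` and the `CPU_SYSFS_ROOT` override are hypothetical, added only so the sketch can be pointed at another directory.

```shell
#!/bin/sh
# Hypothetical helper: print every cpuidle state of one CPU with its
# "disable" flag, read from the standard Linux cpuidle sysfs layout.
# CPU_SYSFS_ROOT is an assumed override, not something jetson_clocks uses.
show_cpuidle_states() {
    cpu="${1:-cpu0}"
    root="${CPU_SYSFS_ROOT:-/sys/devices/system/cpu}"
    for state in "$root/$cpu"/cpuidle/state*; do
        # Skip the unexpanded glob when no cpuidle states are exposed.
        [ -r "$state/name" ] || continue
        printf '%s %s disable=%s\n' \
            "${state##*/}" "$(cat "$state/name")" "$(cat "$state/disable")"
    done
}
```

Running `show_cpuidle_states cpu0` before and after `sudo jetson_clocks` should show whether the `disable` flags change on a given setup.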

Good to know it is fixed. BTW, what’s the GStreamer pipeline?

Just like this.
gst-launch-1.0 nvarguscamerasrc ! nvvidconv ! xvimagesink

Using xvimagesink in the pipeline isn’t optimal, because it causes memory copying in the background.
For best performance, please refer to the pipeline below from the Accelerated GStreamer User Guide.

gst-launch-1.0 nvarguscamerasrc ! 'video/x-raw(memory:NVMM), width=(int)1920, height=(int)1080, format=(string)NV12, framerate=(fraction)30/1' ! nvoverlaysink -e