AGX Xavier (CPU near 100% load) self reboot problem

I have an AGX Xavier with JetPack 4.4 installed.

When I turn on my unit, the CPU 1 load starts at a reasonable 10% to 20% with no apps running. Over time, the CPU 1 load increases, and when it hits 100% the system reboots itself. No application is running; only tegrastats runs in the background to log this behaviour. Sometimes it takes a few hours to reach the self-reboot (like this morning), and sometimes it takes only a few minutes. Whenever a self-reboot happens, the tegrastats log shows the same symptom most of the time: CPU 1 load at or near 100% (see the Background notes below).
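For anyone who wants to repeat the check on the tegrastats logs, here is a minimal sketch of how the per-core load could be extracted. It assumes each log line contains a field of the form `CPU [12%@1190,100%@2265,off,...]`, which is my understanding of the tegrastats output on JetPack 4.x and may differ on other releases. The names `parse_cpu_loads` and `busy_cores` and the 95% threshold are hypothetical choices for illustration, not part of any NVIDIA tool.

```python
import re

# Assumed tegrastats field layout: CPU [12%@1190,100%@2265,off,off]
CPU_FIELD = re.compile(r"CPU \[([^\]]*)\]")


def parse_cpu_loads(line):
    """Return the per-core load percentages found in one log line.

    Cores reported as "off" become None. Returns None when the line
    has no CPU field at all.
    """
    match = CPU_FIELD.search(line)
    if match is None:
        return None
    loads = []
    for entry in match.group(1).split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate an empty field
        if entry == "off":
            loads.append(None)
        else:
            # "100%@2265" -> 100 (the part after '@' is the clock in MHz)
            loads.append(int(entry.split("%", 1)[0]))
    return loads


def busy_cores(lines, threshold=95):
    """Yield (line_number, core_index, load) for samples at or above threshold.

    core_index is the zero-based position of the core inside the CPU [...] list.
    """
    for number, line in enumerate(lines, start=1):
        loads = parse_cpu_loads(line)
        if loads is None:
            continue
        for core, load in enumerate(loads):
            if load is not None and load >= threshold:
                yield number, core, load
```

Opening one of the attached tegrastats logs and iterating over `busy_cores(log_file)` would list every sample where a core sits at or above the threshold, which makes it easier to see whether the load really climbs steadily before each reboot.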

Attached are the serial console log CPU_0_100_self_reboot.log (483.1 KB) and the tegrastats log CPU_0_100_self_reboot_tegrastats.log (2.2 MB) for this morning's self-reboot event.

What should I do to fix this problem? Please advise. Thanks.

Edit: the 2nd self-reboot today came about 30 minutes after the first one described above, again with CPU 1 at 100% load. The 3rd self-reboot came about 30 minutes after the 2nd, but this time the CPU 1 load was below 10% (the log actually shows all CPU loads at 0% at the moment of the reboot, which can't be right!?). It seems to have been triggered by some other issue(?) shown in the serial console log self_reboot_3.log (193.9 KB); the tegrastats log self_reboot_3_tegrastats.log (192 KB) is also attached.

Background: every time I think I have nailed it, the self-reboot happens again the next moment, apparently for quite a different reason. Just FYI: I have been suspecting a thermal issue (not clear yet), a power supply issue (not clear yet), and a network driver issue (not clear yet). Now it shows up as CPU 1 at 100% load, and I thought I could finally nail it, but then the third self-reboot happened with CPU 1 below 10%… Sigh. Interestingly, in retrospect, if you check the tegrastats logs in all 3 of my earlier posts (thermal, power, network), the CPU 1 load is close to 100% in every one of them (99% or so when it is not exactly 100%). I have no clue what the root cause is ;(

Edit2: today's 4th self-reboot showed the same symptom: CPU 1 load near 100%. This time the system survived for almost 3 hours, but it still crashed. Attached are the console log self_reboot_after_nvpmodel_cool_4.log (193.5 KB) and the tegrastats log self_reboot_after_nvpmodel_cool_4_tegrastats.log (2.9 MB).

This looks like a duplicate of "AGX Xavier easy to crash when ethernet network connected".