The summary is in this post
It links to all the posts I have filed on this issue. All the issues I have filed so far seem to trace back to one symptom (not the root cause): the load on CPU 1 inches up over time to 100% or close to it, and then the system crashes.
My guess is that some part of the system keeps firing IRQs and inundates the CPU, so the load gets higher and higher over time. The suspected parts (could be s/w or h/w) are:
- power management: bpmp, etc.
- network: eqos, etc.
- gpu: nvgpu, etc.
Eventually this causes the CPU to stall, followed by "kernel panic - not syncing: softlockup".
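One way to check the IRQ guess is to compare two snapshots of /proc/interrupts and see which interrupt lines grew the most on CPU 1. The following is only a rough sketch, assuming the standard Linux /proc/interrupts layout (a header row of CPUn columns, then one row per IRQ). The function names parse_interrupts, top_deltas and sample_cpu_irqs are made up for this illustration and are not part of any NVIDIA tool.

```python
import time


def parse_interrupts(text, cpu=1):
    """Return {"<irq> <description>": count} for one CPU column of /proc/interrupts text."""
    lines = text.splitlines()
    if not lines:
        return {}
    header = lines[0].split()          # e.g. ['CPU0', 'CPU1', ...] (online CPUs only)
    col = header.index(f"CPU{cpu}")    # raises ValueError if that CPU is offline
    counts = {}
    for line in lines[1:]:
        if ":" not in line:
            continue
        irq, rest = line.split(":", 1)
        fields = rest.split()
        # Summary rows such as ERR carry a single counter, so they have
        # no column for CPUs other than the first; skip them here.
        if col >= len(fields) or not fields[col].isdigit():
            continue
        # Whatever follows the per-CPU counters is the controller/device description.
        desc = " ".join(fields[len(header):])
        counts[f"{irq.strip()} {desc}".strip()] = int(fields[col])
    return counts


def top_deltas(before, after, n=5):
    """Return the n interrupt lines whose counters grew the most between two snapshots."""
    deltas = {name: after[name] - before.get(name, 0) for name in after}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)[:n]


def sample_cpu_irqs(cpu=1, interval=10.0, path="/proc/interrupts", n=5):
    """Take two snapshots `interval` seconds apart and rank the busiest IRQs on `cpu`."""
    with open(path) as f:
        before = parse_interrupts(f.read(), cpu)
    time.sleep(interval)
    with open(path) as f:
        after = parse_interrupts(f.read(), cpu)
    return top_deltas(before, after, n)
```

Calling sample_cpu_irqs() every few minutes while the load climbs and printing the result would show whether one line (for example something under eqos, bpmp or nvgpu) keeps growing faster than the rest. If nothing stands out, the IRQ guess is probably wrong.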
That’s my two cents, but I have no clue what is causing these symptoms. My setup is extremely simple (display + keyboard + mouse + Ethernet), with no other sensors. The unit uses the 65W power supply that came with the product, plugged into a 600W line conditioner used exclusively for the AGX Xavier (no other device is plugged in). The power mode is set to MAXN, with “sudo nvpmodel -d cool” to keep the fan running. The system still reboots by itself even with no apps running. Yesterday, for example, I turned it on around 9:00am; it rebooted itself around 12:15pm, a 2nd time around 12:45pm (still nothing running), a 3rd time around 1:15pm (still no apps running), and a 4th time around 4:30pm. All the console logs and tegrastats logs can be found in this post
Thank you for following up.