Hi, I’m a laboratory technician at UPV university (Spain). We have seven AGX Xavier units for development. One of them (bought in June 2021) now has a problem: it reboots on its own at random, even with no load. If you run “dmesg --follow”, this is what appears on the console just before the reboot:
[ 1084.572288] INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 1084.572522] 0-…: (1 GPs behind) idle=c81/2/0 softirq=11518/11537 fqs=145
[ 1084.572669] (detected by 1, t=5754 jiffies, g=3618, c=3617, q=133)
[ 1084.572815] Task dump for CPU 0:
[ 1084.572825] swapper/0 R running task 0 0 0 0x00000002
[ 1084.572842] Call trace:
[ 1084.572877] [] __switch_to+0x9c/0xc0
[ 1084.572896] [] cpuidle_enter_state+0xa0/0x380
[ 1084.572904] [] cpuidle_enter+0x34/0x48
[ 1084.572915] [] call_cpuidle+0x44/0x70
[ 1084.572923] [] cpu_startup_entry+0x1b0/0x200
[ 1084.572937] [] rest_init+0x84/0x90
[ 1084.572956] [] start_kernel+0x370/0x384
[ 1084.572964] [] __primary_switched+0x80/0x94
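To make the stall report above easier to read, here is a minimal Python sketch that extracts the fields of the "detected by" line. It is only an illustration: the helper name parse_stall is hypothetical, the field meanings follow my reading of the kernel's RCU stall-warning documentation, and the tick rate HZ=250 is an assumption about this kernel's configuration (it can be checked in the kernel config).

```python
import re

# Matches the "(detected by N, t=... jiffies, g=..., c=..., q=...)" line
# printed by the RCU stall detector on 4.9-era kernels.
STALL_RE = re.compile(
    r"\(detected by (?P<cpu>\d+), t=(?P<jiffies>\d+) jiffies, "
    r"g=(?P<g>-?\d+), c=(?P<c>-?\d+), q=(?P<q>\d+)\)"
)

def parse_stall(line, hz=250):
    """Return the stall fields as a dict, or None if the line does not match.

    hz is the kernel tick rate; 250 is an assumed value, not taken from the log.
    """
    m = STALL_RE.search(line)
    if m is None:
        return None
    jiffies = int(m.group("jiffies"))
    return {
        "detected_by_cpu": int(m.group("cpu")),  # CPU that noticed the stall
        "jiffies": jiffies,                      # ticks since the grace period started
        "seconds": jiffies / hz,                 # same duration, converted using hz
        "grace_period": int(m.group("g")),       # current grace-period number
        "completed": int(m.group("c")),          # last completed grace period
        "queued_callbacks": int(m.group("q")),   # RCU callbacks queued
    }

line = "[ 1084.572669] (detected by 1, t=5754 jiffies, g=3618, c=3617, q=133)"
info = parse_stall(line)
print(info)
```

If the HZ=250 assumption holds, t=5754 jiffies means CPU 0 had not reported a quiescent state for roughly 23 seconds when CPU 1 raised the warning.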
System information:
Ubuntu 18.04.6 LTS
Linux xavier 4.9.140-tegra #1 SMP PREEMPT Tue Oct 27 21:02:46 PDT 2020 aarch64 aarch64 aarch64 GNU/Linux
cat /etc/nv_tegra_release:
R32 (release), REVISION: 4.4, GCID: 23942405, BOARD: t186ref, EABI: aarch64, DATE: Fri Oct 16 19:37:08 UTC 2020
lshw:
description: Computer
product: Jetson-AGX
serial: 1421921012841
lscpu:
Architecture: aarch64
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 1
Core(s) per socket: 2
Socket(s): 4
Vendor ID: Nvidia
Model: 0
Model name: ARMv8 Processor rev 0 (v8l)
Stepping: 0x0
CPU max MHz: 2265,6001
CPU min MHz: 115,2000
BogoMIPS: 62.50
L1d cache: 64K
L1i cache: 128K
L2 cache: 2048K
L3 cache: 4096K
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp
Any help would be appreciated.
Best regards.