Losing access to (one or two) Xavier boards after rebooting

Hello,
When I reboot nebra using aurixreset, on some attempts the Xavier boards boot but then stop responding after a few seconds. I can’t ping them or access them via the debug cable. There is nothing interesting in /var/log/syslog, and the aurixreset call itself reports no failure.

I thought this might be related to the Aurix firmware or the hardware.

To update the firmware I went through this document and tried:

  • Making all .sh files in /etc/systemd/scripts/ executable because some of them were not executable by default
  • Running sudo /bin/bash /etc/systemd/scripts/nv_aurix_check_fw.sh -auto_update but it failed:
$ sudo /bin/bash /etc/systemd/scripts/nv_aurix_check_fw.sh -auto_update
[sudo] password for nvidia: 
starting Aurix FW checking...
Checking arguments...
No_Response_From_Aurix
--------------------------------------------------------------------
!!! IMPORTANT INFORMATION BELOW REGARDING Aurix FW (PLEASE READ) !!!
 
No response from Aurix FW. It seems
vlan iface not configured properly
Please check your nv_tacp_init systemd service
 
THIS MESSAGE WILL CONTINUE TO BE SHOWN UNTIL ACTION MENTIONED IS TAKEN
With action taken, this script will be muted and not shown again
----------------------------------------------------------------------
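
A minimal sketch of the first step above (making the scripts executable), in case it helps someone reproduce it. The function name make_scripts_executable is hypothetical and not part of any NVIDIA tooling; only the directory /etc/systemd/scripts/ and the service name nv_tacp_init come from this post.

```shell
#!/usr/bin/env bash
# Sketch: mark every .sh file in a directory as executable and report
# which files were changed. The function name is hypothetical.
make_scripts_executable() {
    local dir="$1" f
    for f in "$dir"/*.sh; do
        [ -e "$f" ] || continue        # the glob matched nothing
        if [ ! -x "$f" ]; then
            chmod +x "$f" && echo "made executable: $f"
        fi
    done
}

# On the target this needs root; the equivalent one-liner is:
#   sudo chmod +x /etc/systemd/scripts/*.sh
# The error text above points at the nv_tacp_init service, which can be
# inspected with standard systemd tooling:
#   systemctl status nv_tacp_init
```

Files that are already executable are left alone, so the function can be run repeatedly without side effects.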
 

In the Aurix shell I can also see some errors reported by showsafetylogs. I am not sure whether they are relevant:

NvShell>showsafetylogs
Info: Executing cmd: showsafetylogs, argc: 0, args: 

Tegra and MCU 3LSS version mismatch for Device ID 0!!!
Device ID 2,  FuSa State Changed : FUSA_ERROR_STATE 

Error Notification received
Device ID : 2
Extended Error Id : 0x3100011
EEP Timestamp : 549755814016
Error Status : ERROR DETECTED
Error Info Size: 0
Error Info : 

Error Notification received
Device ID : 2
Extended Error Id : 0x3100012
EEP Timestamp : 549755814016
Error Status : ERROR DETECTED
Error Info Size: 0
Error Info : 

Unable to read Startup FuSa State 

Tegra and MCU 3LSS version mismatch for Device ID 1!!!
Device ID 2,  FuSa State Changed : FUSA_ERROR_STATE 

Safety Services FuSa State : FUSA_ERROR_STATE 

Unable to read Startup FuSa State 

Device ID 2,  FuSa State Changed : FUSA_ERROR_STATE 
Command Executed

I’ve also seen some forum posts that use the showvoltages command, but it does not seem to be available in my Aurix shell, so I have no way to check the chip voltages. Could this be related to my current firmware version (DRIVE-V5.2.0-E3550-AFW-Aurix-With3LSS-StepA-4.02.02)?

Please provide the following info (check/uncheck the boxes after clicking “+ Create Topic”):
Software Version
DRIVE OS Linux 5.2.0
DRIVE OS Linux 5.2.0 and DriveWorks 3.5
NVIDIA DRIVE™ Software 10.0 (Linux)
NVIDIA DRIVE™ Software 9.0 (Linux)
other DRIVE OS version
other

Target Operating System
Linux
QNX
other

Hardware Platform
NVIDIA DRIVE™ AGX Xavier DevKit (E3550)
NVIDIA DRIVE™ AGX Pegasus DevKit (E3550)
other

SDK Manager Version
1.6.0.8170
1.5.1.7815
1.5.0.7774
other

Host Machine Version
native Ubuntu 18.04
other

Hi @ebrahim4o8qh ,
Are you on DRIVE OS 5.2.0 or DRIVE Software 10.0? Why does the firmware not match?

Hi @VickNV ,

~$ cat /etc/nvidia/version-ubuntu-rootfs.txt 
5.1.6.1-16902563

I am not sure, but I think at some point this machine was flashed with driveos-5.2 for testing purposes and then flashed back to driveos-5.1.6. So if “DRIVE-V5.2.0” at the start of “DRIVE-V5.2.0-E3550-AFW-Aurix-With3LSS-StepA-4.02.02” means the firmware is for driveos-5.2, that may be the reason. But doesn’t flashing 5.1.6 also push the matching firmware? I can see the same firmware on another machine that was initially flashed with driveos-5.2 and then with 5.1.6, whereas on a third machine that was always flashed with 5.1.6 the version command shows: “SW Version: DRIVE-V5.1.6-E3550-EB-Aurix-With3LSS-ForHyperion-StepA-3.05.04”
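
To make the comparison concrete, here is a rough sketch that extracts the release number from the Aurix “SW Version” string and compares it with the rootfs version. It rests on the assumption (not confirmed here) that the “DRIVE-V<x.y.z>-” prefix names the DRIVE OS release the firmware was built for; the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: compare the release embedded in an Aurix "SW Version" string with
# the rootfs version. ASSUMPTION: the "DRIVE-V<x.y.z>-" prefix identifies
# the DRIVE OS release the firmware belongs to.

# DRIVE-V5.2.0-E3550-AFW-... -> 5.2.0
aurix_release() {
    local v="${1#DRIVE-V}"   # strip the fixed prefix
    echo "${v%%-*}"          # keep everything before the next '-'
}

# 5.1.6.1-16902563 -> 5.1.6 (first three dotted fields before the build id)
rootfs_release() {
    echo "$1" | cut -d- -f1 | cut -d. -f1-3
}

# Succeeds (exit status 0) when both releases are identical.
versions_match() {
    [ "$(aurix_release "$1")" = "$(rootfs_release "$2")" ]
}

fw="DRIVE-V5.2.0-E3550-AFW-Aurix-With3LSS-StepA-4.02.02"
rootfs="5.1.6.1-16902563"   # from /etc/nvidia/version-ubuntu-rootfs.txt
if versions_match "$fw" "$rootfs"; then
    echo "firmware and rootfs releases match"
else
    echo "mismatch: firmware $(aurix_release "$fw") vs rootfs $(rootfs_release "$rootfs")"
fi
```

With the two strings from this machine the script reports a mismatch (5.2.0 vs 5.1.6), while the string from the machine that was always on 5.1.6 matches.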

Please refer to Downgrade DRIVE OS 5.2.0 to DRIVE Software 10. It should be able to solve the problem. Thanks.

Thanks for the answer.