Booting into recovery mode after continuous reboots - AGX Xavier on 5.1.2

I recently updated an AGX Xavier to 5.1.2 and have observed odd behaviour after continuously rebooting the unit by logging in, running reboot and repeating as soon as the shell comes up.
After three reboots, the system boots into recovery mode and the UEFI menu shows that slot A has been marked as unbootable. Setting this back to ‘normal’ allows normal operation to resume.
I have been able to reproduce this on two different systems. Is this intended behaviour? If it is, is it configurable at all?

Hi,

So your steps are as follows:

  1. Boot into the rootfs and wait for the AGX Xavier shell to come up.
  2. Reboot the device using the command sudo reboot.
  3. Repeat until slot A is marked unbootable.

Do you observe any kernel panic?
Please record a serial console log and attach it for us to review.

Thanks

Hi David. Yes, those are the exact operations taken. We observe this after 3 reboots, as shown in the log attached.
1.0_RECOVERY_MODE_TESTING_DEFAULT_IMAGE.log (308.1 KB)

No kernel panic occurs and I cannot see any obvious error in the logs.
The reboot seen here is issued within 10 seconds of systemd starting up. We’re now testing whether the issue still occurs if we reboot after the nv-l4t-bootloader-config service succeeds.

Our assumption is that rebooting too quickly means this service never gets to set a UEFI variable, causing the system to believe it has not booted successfully.

If we reboot after this service succeeds, no issue occurs. In our bad-case scenario, nvbootctrl verify is never called, which seems to be the cause. The fact that it happens after three reboots points towards ROOTFS_RETRY_MAX_COUNT=3, which is the default set during flashing.
Is it possible to configure this without flashing? As we do not have an additional boot device, we never want the slot to be marked as unbootable.

Hi,

You could add the following line (marked with +) to /opt/nvidia/l4t-bootloader-config/nv-l4t-bootloader-config.service

 After=nv.service
 After=nvgetty.service
 After=l4t-rootfs-validation-config.service
+Before=shutdown.target reboot.target halt.target

Then test whether the issue still occurs.

Thanks

Thanks for the suggestion, David.
We opted to add a service with
After=sysinit.target
that manually calls ‘nvbootctrl verify’, as we were also able to reproduce this behavior by cutting power.
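
A minimal sketch of what such a unit might look like is given below. Only the After=sysinit.target ordering and the nvbootctrl verify call come from this thread. The unit file name, the Type=oneshot setting, the /usr/sbin/nvbootctrl path and the WantedBy target are assumptions made for illustration and should be checked against the actual system.

```ini
# /etc/systemd/system/nvbootctrl-verify-early.service
# (hypothetical file name; not taken from the thread)
[Unit]
Description=Run nvbootctrl verify early in boot (sketch)
# Ordering from the thread: start once sysinit.target has been reached.
After=sysinit.target

[Service]
# Assumption: a one-shot command that runs once per boot and exits.
Type=oneshot
# Assumption: nvbootctrl is installed at this path; confirm with "command -v nvbootctrl".
ExecStart=/usr/sbin/nvbootctrl verify

[Install]
# Assumption: pull the unit in during a normal multi-user boot.
WantedBy=multi-user.target
```

After creating the file, running sudo systemctl daemon-reload followed by sudo systemctl enable nvbootctrl-verify-early.service would activate it on the next boot. If nvbootctrl verify is indeed what prevents the slot from being marked unbootable, as the tests above suggest, calling it this early means the slot is confirmed before most other services have started.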
Cheers!
