Rootfs A/B Why is the retry_count not decremented during Cold reboots

When nvbootctrl verify is not executed after booting and the device is then power-cycled, the retry_count (retrieved via nvbootctrl dump-slots-info -t rootfs is not decremented).
So even if this is repeated it does not lead to the device switching the boot chain.

You confirmed that this is the intended behaviour here: Orin Nano AB Booting Not working - #14 by JerryChang .

We would expect that the retry_count gets decremented even for power-cycles/cold boots.
Can you explain why this is not the case?
(Maybe I missed it, but it also seems that this behaviour is not documented in your Developer Guide)

hello jonathan.schnitzler,

let me double check which Jetpack release version you’re working with now?
please also share your reproduce steps for reference.

Hi,
we are on JetPack 6.1/Jetson Linux 36.4.1
To reproduce:

  • Boot device, ensure nvbootctrl verify is not run
  • Run nvbootctrl dump-slots-info -t rootfs
  • Turn off power supply
  • Repeat

retry_count reported by nvbootctrl dump-slots-info -t rootfs will not get decremented and the slot will be never marked as unbootable.

hello jonathan.schnitzler,

normally, if one root file system slot is unbootable, the kernel watchdog will reboot the device, and if fails to boot into it for consecutive 3 times, then the UEFI try to boot from another rootfs slot.

you should corrupted the file system for testing rootfs-A/B.
for instance, you may using dd command to erase the APP_A partition.
i.e. $ sudo dd if=/dev/zero of=/dev/nvme0n1p1 bs=1M count=1

Hi,
thank you for answering the question, but this does not answer my question at all.
If you think this should be the case, can you explain how your answer relates to my question?

hello jonathan.schnitzler,

this is cold reboot, whereas the retry_count decremented is triggered by software (i.e. warm-reboot).
so.. you should try with root file system at unbootable state for testing rootfs-A/B.

this is cold reboot, whereas the retry_count decremented is triggered by software (i.e. warm-reboot).

Correct. And my question in this whole thread is: What is the reason that Nvidia decided to only decrement the retry_count for warm reboots and not for cold reboots?

hello jonathan.schnitzler,

there’s a background service, nv_update_engine.
If the root filesystem fails to boot, make sure to reboot the device, so that the nv_update_engine will not update boot status to successful.