I am currently encountering an issue. After enabling A/B partitions and performing multiple OTA updates, I found that one slot's status is marked as unbootable. How can I set the status of this slot back to normal?
run@agi:~$ sudo nvbootctrl dump-slots-info
Current version: 36.4.3
Capsule update status: 0
Current bootloader slot: B
Active bootloader slot: B
num_slots: 2
slot: 0, status: unbootable
slot: 1, status: normal
May I know your steps to reproduce this issue? Did you use image-based OTA or capsule update?
Also, note that nvbootctrl shows the bootloader slot info by default.
Per your comment "After enabling the AB partitions", you've enabled ROOTFS_AB=1, right?
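Since nvbootctrl reports the bootloader slots unless -t rootfs is given, it may help to check both targets. A minimal sketch, assuming the slot lines keep the "slot: N, ... status: X" shape shown in the dump above. The helper name unbootable_slots is hypothetical, and it only filters text on stdin:

```shell
# Print the slot numbers that dump-slots-info reports as unbootable.
# Reads the dump text on stdin, so it works for either target.
unbootable_slots() {
    # Matches lines such as "slot: 0, status: unbootable"
    sed -n 's/^slot: \([0-9][0-9]*\),.*status: unbootable.*$/\1/p'
}

# Usage on the device (not executed here):
#   sudo nvbootctrl dump-slots-info | unbootable_slots            # bootloader slots (default)
#   sudo nvbootctrl -t rootfs dump-slots-info | unbootable_slots  # rootfs slots
```

With the dump posted in this thread, the bootloader check would print 0.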
We enabled the ROOTFS_AB=1 configuration and performed several OTA updates. Suddenly, we found that we could no longer switch to the target partition. Checking the slot information with nvbootctrl showed it as “unbootable.” How can I quickly clear this status so that I can perform another OTA update? The faulty device is at the customer site, and we want to restore it to a normal state before upgrading again. Are there any commands available for this?
In my OTA update package, I have configured it to upgrade both the rootfs and bootloader simultaneously. However, even after multiple OTA attempts, this status still cannot be cleared.
Please try using the nvbootctrl command to mark the unbootable slot as bootable.
For instance, you can try $ sudo nvbootctrl mark-bootable 0 to mark slot 0 as bootable.
It seems that nvbootctrl doesn’t support the mark-bootable command:
$ sudo nvbootctrl mark-bootable 0
nvbootctrl - command-line wrapper for the boot_control HAL.
Usage:
nvbootctrl [Options] Command
Options:
-t - target available: bootloader or rootfs. Default: bootloader.
Commands:
get-number-slots - Prints number of slots.
get-current-slot - Prints currently running SLOT.
set-active-boot-slot SLOT - On next boot, load and execute SLOT.
dump-slots-info - Prints info for slots.
verify - Verify the bootloader and rootfs boot.
verify-bl - Verify the bootloader boot only.
is-rootfs-ab-enabled - Rootfs only. Return 0 if rootfs A/B is disabled;
Return 1 if rootfs A/B is enabled, current slot is A;
Return 2 if rootfs A/B is enabled, current slot is B;
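The return values of is-rootfs-ab-enabled listed in the usage text above can be turned into a readable message. This is only a sketch: it assumes the "Return" values are delivered as the process exit status, and the helper name describe_rootfs_ab is hypothetical.

```shell
# Translate the documented is-rootfs-ab-enabled return value into text.
describe_rootfs_ab() {
    case "$1" in
        0) echo "rootfs A/B disabled" ;;
        1) echo "rootfs A/B enabled, current slot A" ;;
        2) echo "rootfs A/B enabled, current slot B" ;;
        *) echo "unexpected return value: $1" ;;
    esac
}

# Usage on the device (not executed here; assumes the value is the exit status):
#   sudo nvbootctrl -t rootfs is-rootfs-ab-enabled; describe_rootfs_ab "$?"
```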
BTW, you may try entering the UEFI menu to update the slot status.
For instance:
Device Manager → NVIDIA Configuration → L4T Configuration → OS chain A status → change to Normal
I was just making routine changes to the rootfs, then building an OTA package every few days or once a week and flashing it to the board as usual. All of a sudden, after one particular update, the board failed to boot—the partition had become unbootable.
Can I fix it without going into UEFI? The board has already been assembled and shipped to the customer, so we have no way to reach the serial console. I’d really like to complete the repair entirely from within Linux—just run a few commands and be done.
We may need to look at the error logs to find the root cause.
For instance, if a binary file is missing from the update, the update process will not be triggered.
The target should then report error logs like the following.
FmpTegraCheckImage: Missing required image for partition mb1: Not Found
FmpDxe(NVIDIA System Firmware): CheckTheImage() - FmpDeviceLib CheckImage failed. Status = Aborted
FmpDxe(NVIDIA System Firmware): SetTheImage() - Check The Image failed with Aborted.
Unfortunately, there’s no single command to reset the slot info.
You’ll need to identify the root cause, update the binary files to address the failure, and then enter UEFI to reset the slot info.
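To check a captured log for the capsule-update failures quoted above, a simple filter may be enough. A rough sketch, assuming the messages keep the FmpTegraCheckImage / FmpDxe wording shown in this thread. The helper name fmp_errors is hypothetical, and the log is read from stdin:

```shell
# Print the FMP capsule-update failure lines found in a log on stdin.
fmp_errors() {
    grep -E 'FmpTegraCheckImage|FmpDxe\(.*(failed|Aborted)'
}

# Usage (not executed here; uefi.log is a hypothetical capture file):
#   fmp_errors < uefi.log
```

No output means none of these particular messages were logged. It does not prove that the update succeeded.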
I’ve already obtained the problematic board. I attempted to re-OTA it, and the upgrade completed without any errors. Attached are the OTA upgrade log and the serial-port log captured after the successful upgrade and reboot. Please help me figure out why the “unbootable” status still can’t be cleared.
I> Task: Ratchet update
W> Skip ratchet update - OPTIN fuse not set
I> Task: Prepare eeprom data
E> I2C: slave not found in slaves.
E> I2C: Could not write 0 bytes to slave: 0x00ae with repeat start true.
E> I2C_DEV: Failed to send register address 0x00000000.
E> I2C_DEV: Could not read 256 registers of size 1 from slave 0xae at 0x00000000 via instance 0.
E> eeprom: Retry to read I2C slave device.
E> I2C: slave not found in slaves.
E> I2C: Could not write 0 bytes to slave: 0x00ae with repeat start true.
E> I2C_DEV: Failed to send register address 0x00000000.
E> I2C_DEV: Could not read 256 registers of size 1 from slave 0xae at 0x00000000 via instance 0.
E> eeprom: Failed to read I2C slave device
C> Task 0x0 failed (err: 0x1f1e050d)
E> Top caller module: I2C_DEV, error module: I2C, reason: 0x0d, aux_info: 0x05
I> Busy Spin
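Because the retry repeats the same I2C messages, it can be easier to read the failure after dropping duplicates. A small sketch, assuming the E> and C> prefixes mark error and critical lines as they appear to in the log above. The helper name mb_failures is hypothetical:

```shell
# Print each distinct E>/C> line once, in the order it first appears.
mb_failures() {
    awk '/^[EC]> / && !seen[$0]++'
}

# Usage (not executed here; serial.log is a hypothetical capture file):
#   mb_failures < serial.log
```

On the log above, this leaves the I2C and eeprom errors once each, followed by the "Task 0x0 failed" line and the "Top caller module" line.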
Due to this failure, the device fell back to slot B and marked slot A as unbootable.
Are you working with a customized carrier board? The error being reported is related to the EEPROM.
Please check the MB2 configuration changes described in the developer guide.
You may need the following modification in the MB2-BCT file (i.e. tegra234-mb2-bct-common.dtsi) when the carrier board is designed without an EEPROM.
I’m using a customized carrier board with an Orin NX 16 GB module. Out of several thousand units shipped, only one has shown this EEPROM-related error. Can I conclude this is a hardware issue?
In our current configuration, we have already changed it to 0: