Hi Jerry,
We tested the R36.5 bootloader-only OTA/capsule flow you suggested, using the -b option. The board is a fused Secure Boot / ROOTFS_AB Jetson Orin Nano/NX-class board booting from NVMe.
The package generation was based on this bootloader-only OTA flow:
sudo -E ./tools/ota_tools/version_upgrade/l4t_generate_ota_package.sh \
--external-device nvme0n1 \
-b \
jetson-orin-nano-devkit \
R36-5
In our secure-boot setup, we also supplied the PKC and UEFI key options and used the generated bootloader-only OTA payload package on target.
Before running OTA, the board was clean:
Capsule update status: 0
Current bootloader slot: A
Active bootloader slot: A
slot 0: normal
slot 1: normal
Current rootfs slot: A
Active rootfs slot: A
rootfs slot 0: normal, retry_count 3
rootfs slot 1: normal, retry_count 3
We then ran nv_ota_start.sh on target. The script completed staging successfully and reported:
ROOTFS_AB_ENABLED=1
ROOTFS_CURRENT_SLOT=0
UPDATE_SLOT=B
UPDATE_BOOTLOADER=1, UPDATE_ROOTFS=0
trigger_uefi_capsule_update /ota_work /dev/nvme0n1
Copying /ota_work/TEGRA_BL.Cap into /opt/nvidia/esp/EFI/UpdateCapsule
Triggering UEFI capsule update ... OsIndications...
Bootloader on non-current slot(B) is to be updated once device is rebooted
After reboot, the board came back normally on chain A:
Capsule update status: 1
Current bootloader slot: A
Active bootloader slot: A
slot 0: normal
slot 1: normal
rootfs A/B both normal
So at that point, the capsule appeared to have been consumed successfully.
However, since the OTA script reported that the non-current slot B would be updated on reboot, we then tried to verify B by running:
sudo nvbootctrl set-active-boot-slot 1
sudo reboot
The serial console showed:
Rebooting to new boot chain
But the next successful boot still came up on chain A, and nvbootctrl reported:
Capsule update status: 1
Current bootloader slot: A
Active bootloader slot: A
slot 0: normal
slot 1: unbootable
Rootfs slots A and B were still healthy:
Current rootfs slot: A
Active rootfs slot: A
slot 0: retry_count 3, status normal
slot 1: retry_count 3, status normal
So the bootloader-only capsule update appears to have left the non-current bootloader slot B in a state that does not boot. The board fell back to A, and B is now marked unbootable.
This looks similar to a previous failure we saw where the opposite slot became unbootable after an OTA/capsule test. Our current hypothesis is that the OTA capsule updates the non-current chain, but the updated chain does not boot correctly on this fused Secure Boot / ROOTFS_AB configuration. If the system then attempts to boot that updated chain, it fails and the bootloader slot gets marked unbootable.
Do you have any suggestions for what to inspect next?
Some specific questions:
1. Is "Capsule update status: 1" definitely the success state for this flow?
2. After a bootloader-only OTA capsule update, is the updated non-current bootloader slot expected to be immediately bootable via:
sudo nvbootctrl set-active-boot-slot <slot>
sudo reboot
3. Are there additional secure-boot signing/capsule certificate requirements beyond PKC and UEFI DB signing that could make the capsule apply but leave the updated slot unbootable?
4. Is there a recommended way to dump or decode the persistent boot-chain status metadata that marks a slot unbootable?
5. Would you expect a surgical reflash of B_cpu-bootloader to clear this state, or is a full reflash required to reset the persistent boot-chain metadata?
6. Should a bootloader-only OTA package for a fused Secure Boot / ROOTFS_AB NVMe boot system include any additional options or artifacts beyond the standard -b --external-device nvme0n1 flow?
We captured serial logs for the reboot and pre/post nvbootctrl state this time, so we can provide details if useful.
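In case it helps with reproducing this, here is a minimal sketch of how the pre/post nvbootctrl state can be compared. It assumes the dumps were saved as text from nvbootctrl dump-slots-info; the function name and the file names pre.txt and post.txt are hypothetical. The filter only looks for lines that mention both "slot" and "unbootable", so it should not depend on the exact column layout, which is abbreviated in this post.

```shell
#!/bin/sh
# Print every line of a saved slot dump (read from stdin) that mentions a
# slot marked unbootable. Matching is case-insensitive and layout-agnostic.
# awk is used instead of grep so that "no match" still exits with status 0.
unbootable_slots() {
    awk 'tolower($0) ~ /slot/ && tolower($0) ~ /unbootable/ { print }'
}

# Intended use on target (file names are placeholders):
#   sudo nvbootctrl dump-slots-info > pre.txt     # before the test
#   ... run the OTA / slot switch and reboot ...
#   sudo nvbootctrl dump-slots-info > post.txt    # after the test
#   diff pre.txt post.txt
#   unbootable_slots < post.txt
```

Run against the post-reboot state quoted above, this should print only the slot 1 line; against the clean pre-OTA state it prints nothing.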
Thanks,
Alin