We’re seeing some weird intermittent problems where the boot partition is corrupted after an update.
We are upgrading units that were initially tegraflashed with JetPack 3.2.1 to an OS built with JetPack 4.4.
Upgrading from 3.2.1 to 4.4 seems to go OK (I’m not totally certain though, I have not tested this as much as the next part).
The problem occurs when doing another OS upgrade, from one 4.4-based build to another 4.4-based build.
When we copy the update image to the unit over USB (either SFTP or MTP) before upgrading, there seems to be a ~5% chance that the boot partition is corrupted during or after the upgrade.
When we don't actively use the USB bus, but instead use an update file already located on the eMMC, there's a ~0.1% chance that the boot partition gets corrupted.
Our upgrade process is as follows:
- Figure out if we need to update to partition_name or partition_name_b
- Copy new rootfs and kernel image to APP partition
- Copy boot partition content. Since we’re updating a unit tegraflashed using an old Jetpack, we can’t use nv_update_engine. Instead we copy the files from the BUP-payload to the following partitions:
- Call nvbootctrl set-active-boot-slot
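
The slot selection in the first and last steps can be sketched roughly as below. This is only a minimal illustration, assuming slot 0 maps to the unsuffixed partition name and slot 1 to the `_b` suffix. The helper names and the `kernel` partition in the usage lines are hypothetical, and the `nvbootctrl` command is only built here, not executed.

```python
from typing import List


def inactive_slot(current_slot: int) -> int:
    """Return the slot that is not currently booted (0 <-> 1)."""
    if current_slot not in (0, 1):
        raise ValueError(f"unexpected boot slot: {current_slot}")
    return 1 - current_slot


def target_partition(base_name: str, slot: int) -> str:
    """Assumption: slot 0 uses the plain name, slot 1 the '_b' suffix."""
    if slot not in (0, 1):
        raise ValueError(f"unexpected boot slot: {slot}")
    return base_name if slot == 0 else f"{base_name}_b"


def set_active_command(slot: int) -> List[str]:
    """Build (but do not run) the nvbootctrl call from the last step."""
    if slot not in (0, 1):
        raise ValueError(f"unexpected boot slot: {slot}")
    return ["nvbootctrl", "set-active-boot-slot", str(slot)]


# Usage: currently booted from slot 0, so the update targets slot 1.
slot = inactive_slot(0)
print(target_partition("kernel", slot))  # hypothetical partition name
print(set_active_command(slot))
```

Keeping the command construction separate from its execution makes it possible to log exactly which slot and partition names an update run intended to touch.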
When the boot slot is 1 and the SMD partition is broken, CBoot will fail a number of times and then fall back to boot slot 0.
I am certain that the integrity of the update file is OK. We verify the SHA-256 checksum of the image and binaries before upgrading.
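
A checksum check of this kind can be sketched as follows. This is a minimal example using Python's standard `hashlib`, not necessarily the exact code run on the units. The function names and the chunk size are assumptions made for illustration.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large images need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns b"" at EOF.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path: str, expected_hex: str) -> bool:
    """Compare the computed digest with the expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Note that this only confirms the source file is intact before it is written. It says nothing about what actually ended up on the target partition.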
Does anyone have an idea of what might be going on here?
I have attached the log of a failing cboot:
tx2-boot-partition-failure.log (75.4 KB)