Atomic bootloader update on TX1: TBoot ignores BFS1 when RP1 is present in GPT

I am trying to update Tegra X1 devices in a production environment from L4T 24.2.1 to L4T 32.6.1.

I was able to write BFS1 to boot1:
NXC-1
PT-1
TXC-1
RP1-1
EBT-1
WX0-1
BXF-1
And then I wrote KFS1 to the user area:
DTB-1
TOS-1
EKS-1
LNX-1

Now when I write the BCT, it boots TBoot from NXC-1, which is good.
But then TBoot sees an RP1 partition in the GPT on the eMMC, completely ignores BFS1 and the RP1-1 partition on mmcblk0boot1, and fails to validate the bootloader DTB.

I can easily replicate this behavior on vanilla L4T: flash the device, then rename any partition (e.g. UDA) to RP1. It will fail to boot both BFS0 and BFS1, and will be stuck in recovery.
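Before touching the BCT, an update script could check whether the GPT still contains an entry named RP1. Below is a minimal sketch of such a check, not a tested tool. It parses the primary GPT from a raw image using the standard on-disk layout (header at LBA 1, entry names stored as UTF-16LE). The function names and the 512-byte sector size are my assumptions, and it says nothing about how TBoot itself resolves partitions.

```python
import struct
from typing import List

def gpt_partition_names(image: bytes, sector_size: int = 512) -> List[str]:
    """Return the names of all used entries in the primary GPT of a raw image."""
    header = image[sector_size:sector_size + 92]  # primary GPT header is at LBA 1
    if header[:8] != b"EFI PART":
        raise ValueError("no GPT signature at LBA 1")
    (entries_lba,) = struct.unpack_from("<Q", header, 72)       # start of entry array
    num_entries, entry_size = struct.unpack_from("<II", header, 80)
    names = []
    for i in range(num_entries):
        off = entries_lba * sector_size + i * entry_size
        entry = image[off:off + entry_size]
        # An all-zero type GUID marks an unused slot.
        if len(entry) < 128 or entry[:16] == bytes(16):
            continue
        # Bytes 56..127 of an entry hold the NUL-padded UTF-16LE name.
        names.append(entry[56:128].decode("utf-16-le").split("\x00", 1)[0])
    return names

def has_conflicting_rp1(image: bytes) -> bool:
    """True if any GPT entry is named exactly RP1 (the case described in this thread)."""
    return "RP1" in gpt_partition_names(image)
```

On a device the image bytes would come from the start of /dev/mmcblk0. With the common layout of 128 entries of 128 bytes each, the first 34 sectors cover the header and the whole entry array, but that size is an assumption worth verifying against the header fields.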

So I am unable to do an atomic bootloader update, because I need to update the BCT to use NXC-1 and, at the same time, update the GPT to remove the RP1 partition that confuses TBoot.

Maybe there is some configuration option for TBoot to ignore the GPT and use PT-1 in boot1? Or at least to ignore RP1 and use RP1-1 according to its BFS number?
I can clearly see some configuration in the Customer data of the BCT, but there is no documentation for it.

UPD: It also reproduces on the latest L4T, 32.7.2:

  1. sudo ./flash.sh -u secure-boot-rsa-2048bit-private.pem jetson-tx1 mmcblk0p1
  2. ssh to device
  3. sudo sgdisk /dev/mmcblk0 -c 22:RP1
  4. reboot

Result:

[0000.382] Using GPT Primary to query partitions
[0000.387] NvTbootFailControlDoFailover: No failover; Continuing ...
[0000.393] Read PT from (0:3)
[0000.428] PT crc32 and magic check passed.
[0000.432] Using BFS PT to query partitions 
[0000.437] PT: Partition RP5 NOT found ! 
[0000.440] Warning: Find Partition via PT Failed
[0000.445] Load RPB failed, skip RPB.
[0000.448] *** Booting BFS0.
[0000.451] NvTbootFailControlDoFailover: No failover; Continuing ...
[0000.457] PT: Partition LNX NOT found ! 
[0000.461] *** Booting KFS0.
[0000.463] NvTbootFailControlDoFailover: No failover; Continuing ...
[0000.469] Rail not supported
[0000.472] BoardID = 2180, SKU = 0x3e8
[0000.475] Not Nano-SD or !QSPI-ONLY, check GPT table first ...
[0000.481] NvTbootFailControlDoFailover: No failover; Continuing ...
[0000.488] Loading Tboot-CPU binary
[0000.495] Verifying TBC in SecurePKC mode
[0000.504] Bootloader load address is 0xa0000000, entry address is 0xa0000258
[0000.511] Bootloader downloaded successfully.
[0000.515] Downloaded Tboot-CPU binary to 0xa0000258
[0000.520] MAX77620_GPIO1 Configured.
[0000.524] MAX77620_GPIO5 Configured.
[0000.527] CPU power rail is up
[0000.530] CPU clock enabled
[0000.534] Performing RAM repair
[0000.537] Updating A64 Warmreset Address to 0xa00002e9
[0000.542] BoardID = 2180, SKU = 0x3e8
[0000.545] Not Nano-SD or !QSPI-ONLY, check GPT table first ...
[0000.551] Loading NvTbootBootloaderDTB
[0000.568] NvTbootBootloaderDTB is not valid
[0000.572] NvTbootBootloaderDTB partition is corrupted!
[0000.576] *** Set to failover in the next boot ***
[0000.581] NvTbootFailControlSetClobber:
[0000.585] *** Rebooting ***
[0000.179] [L4T TegraBoot] (version 00.00.2018.01-l4t-8728f3cb)
[0000.184] Processing in cold boot mode Bootloader 2
[0000.189] A02 Bootrom Patch rev = 255
[0000.192] Power-up reason: software reset
[0000.196] No Battery Present
[0000.199] pmic max77620 reset reason
[0000.202] pmic max77620 NVERC : 0x0
[0000.205] RamCode = 0
[0000.208] Platform has Ddr4 type ram
[0000.211] max77620 disabling SD1 Remote Sense
[0000.215] Setting Ddr voltage to 1125mv
[0000.219] Serial Number of Pmic Max77663: 0x220fed
[0000.227] Entering ramdump check
[0000.230] Get RamDumpCarveOut = 0x0
[0000.233] RamDumpCarveOut=0x0,  RamDumperFlag=0xe59ff3f8
[0000.238] Last reboot was clean, booting normally!
[0000.243] Sdram initialization is successful 
[0000.247] SecureOs Carveout Base=0x00000000ff800000 Size=0x00800000
[0000.253] Lp0 Carveout Base=0x00000000ff780000 Size=0x00001000
[0000.259] BpmpFw Carveout Base=0x00000000ff700000 Size=0x00080000
[0000.265] GSC1 Carveout Base=0x00000000ff600000 Size=0x00100000
[0000.270] GSC2 Carveout Base=0x00000000ff500000 Size=0x00100000
[0000.276] GSC4 Carveout Base=0x00000000ff400000 Size=0x00100000
[0000.282] GSC5 Carveout Base=0x00000000ff300000 Size=0x00100000
[0000.288] GSC3 Carveout Base=0x000000017f300000 Size=0x00d00000
[0000.304] RamDump Carveout Base=0x00000000ff280000 Size=0x00080000
[0000.310] Platform-DebugCarveout: 0
[0000.313] Nck Carveout Base=0x00000000ff080000 Size=0x00200000
[0000.319] BoardID = 2180, SKU = 0x3e8
[0000.323] Not Nano-SD or !QSPI-ONLY, check GPT table first ...
[0000.328] Read GPT from (0:3)
[0000.374] Csd NumOfBlocks=0
[0000.382] Using GPT Primary to query partitions
[0000.387] NvTbootFailControlDoFailover: Doing failover: NvTboot partition 5 is corrupted (ec=0x14).
[0000.396] NvTbootFailControlDoClobber:
[0000.399] Current BFS=0
[0000.401] Clobber: Current BFS starts at 2048
[0000.406] *** Will failover to BFS 1.
[0000.410] *** Invert Offset 0 from 0x30 to 0xffffffcf
[0000.437] Clobber: returns e=0x0
[0000.440]  - failover successful; Rebooting ...
[0000.228] [L4T TegraBoot] (version 00.00.2018.01-l4t-8728f3cb)
[0000.233] Processing in cold boot mode Bootloader 2
[0000.238] A02 Bootrom Patch rev = 255
[0000.242] Power-up reason: software reset
[0000.245] No Battery Present
[0000.248] pmic max77620 reset reason
[0000.251] pmic max77620 NVERC : 0x0
[0000.255] RamCode = 0
[0000.257] Platform has Ddr4 type ram
[0000.260] max77620 disabling SD1 Remote Sense
[0000.264] Setting Ddr voltage to 1125mv
[0000.268] Serial Number of Pmic Max77663: 0x220fed
[0000.276] Entering ramdump check
[0000.279] Get RamDumpCarveOut = 0x0
[0000.282] RamDumpCarveOut=0x0,  RamDumperFlag=0xe59ff3f8
[0000.287] Last reboot was clean, booting normally!
[0000.292] Sdram initialization is successful 
[0000.296] SecureOs Carveout Base=0x00000000ff800000 Size=0x00800000
[0000.302] Lp0 Carveout Base=0x00000000ff780000 Size=0x00001000
[0000.308] BpmpFw Carveout Base=0x00000000ff700000 Size=0x00080000
[0000.314] GSC1 Carveout Base=0x00000000ff600000 Size=0x00100000
[0000.320] GSC2 Carveout Base=0x00000000ff500000 Size=0x00100000
[0000.326] GSC4 Carveout Base=0x00000000ff400000 Size=0x00100000
[0000.331] GSC5 Carveout Base=0x00000000ff300000 Size=0x00100000
[0000.337] GSC3 Carveout Base=0x000000017f300000 Size=0x00d00000
[0000.353] RamDump Carveout Base=0x00000000ff280000 Size=0x00080000
[0000.360] Platform-DebugCarveout: 0
[0000.363] Nck Carveout Base=0x00000000ff080000 Size=0x00200000
[0000.369] BoardID = 2180, SKU = 0x3e8
[0000.372] Not Nano-SD or !QSPI-ONLY, check GPT table first ...
[0000.378] Read GPT from (0:3)
[0000.423] Csd NumOfBlocks=0
[0000.432] Using GPT Primary to query partitions
[0000.436] NvTbootFailControlDoFailover: No failover; Continuing ...
[0000.442] Read PT from (0:3)
[0000.478] PT crc32 and magic check passed.
[0000.482] Using BFS PT to query partitions 
[0000.486] PT: Partition RP5 NOT found ! 
[0000.490] Warning: Find Partition via PT Failed
[0000.494] Load RPB failed, skip RPB.
[0000.497] *** Booting BFS1.
[0000.500] NvTbootFailControlDoFailover: No failover; Continuing ...
[0000.506] PT: Partition LNX NOT found ! 
[0000.510] *** Booting KFS1.
[0000.513] NvTbootFailControlDoFailover: No failover; Continuing ...
[0000.519] Rail not supported
[0000.521] BoardID = 2180, SKU = 0x3e8
[0000.525] Not Nano-SD or !QSPI-ONLY, check GPT table first ...
[0000.531] NvTbootFailControlDoFailover: No failover; Continuing ...
[0000.538] Loading Tboot-CPU binary
[0000.544] Verifying TBC in SecurePKC mode
[0000.553] Bootloader load address is 0xa0000000, entry address is 0xa0000258
[0000.560] Bootloader downloaded successfully.
[0000.564] Downloaded Tboot-CPU binary to 0xa0000258
[0000.570] MAX77620_GPIO1 Configured.
[0000.573] MAX77620_GPIO5 Configured.
[0000.576] CPU power rail is up
[0000.579] CPU clock enabled
[0000.583] Performing RAM repair
[0000.586] Updating A64 Warmreset Address to 0xa00002e9
[0000.591] BoardID = 2180, SKU = 0x3e8
[0000.595] Not Nano-SD or !QSPI-ONLY, check GPT table first ...
[0000.600] Loading NvTbootBootloaderDTB
[0000.617] NvTbootBootloaderDTB is not valid
[0000.621] NvTbootBootloaderDTB partition is corrupted!
[0000.626] *** Set to failover in the next boot ***
[0000.631] NvTbootFailControlSetClobber:
[0000.634] *** Rebooting ***
[0000.228] [L4T TegraBoot] (version 00.00.2018.01-l4t-8728f3cb)
[0000.233] Processing in cold boot mode Bootloader 2
[0000.238] A02 Bootrom Patch rev = 255
[0000.242] Power-up reason: software reset
[0000.245] No Battery Present
[0000.248] pmic max77620 reset reason
[0000.251] pmic max77620 NVERC : 0x0
[0000.255] RamCode = 0
[0000.257] Platform has Ddr4 type ram
[0000.260] max77620 disabling SD1 Remote Sense
[0000.264] Setting Ddr voltage to 1125mv
[0000.268] Serial Number of Pmic Max77663: 0x220fed
[0000.276] Entering ramdump check
[0000.279] Get RamDumpCarveOut = 0x0
[0000.282] RamDumpCarveOut=0x0,  RamDumperFlag=0xe59ff3f8
[0000.287] Last reboot was clean, booting normally!
[0000.292] Sdram initialization is successful 
[0000.296] SecureOs Carveout Base=0x00000000ff800000 Size=0x00800000
[0000.302] Lp0 Carveout Base=0x00000000ff780000 Size=0x00001000
[0000.308] BpmpFw Carveout Base=0x00000000ff700000 Size=0x00080000
[0000.314] GSC1 Carveout Base=0x00000000ff600000 Size=0x00100000
[0000.320] GSC2 Carveout Base=0x00000000ff500000 Size=0x00100000
[0000.326] GSC4 Carveout Base=0x00000000ff400000 Size=0x00100000
[0000.331] GSC5 Carveout Base=0x00000000ff300000 Size=0x00100000
[0000.337] GSC3 Carveout Base=0x000000017f300000 Size=0x00d00000
[0000.353] RamDump Carveout Base=0x00000000ff280000 Size=0x00080000
[0000.360] Platform-DebugCarveout: 0
[0000.363] Nck Carveout Base=0x00000000ff080000 Size=0x00200000
[0000.369] BoardID = 2180, SKU = 0x3e8
[0000.372] Not Nano-SD or !QSPI-ONLY, check GPT table first ...
[0000.378] Read GPT from (0:3)
[0000.423] Csd NumOfBlocks=0
[0000.432] Using GPT Primary to query partitions
[0000.436] NvTbootFailControlDoFailover: Doing failover: NvTboot partition 5 is corrupted (ec=0x14).
[0000.445] NvTbootFailControlDoClobber:
[0000.449] Current BFS=1
[0000.451] Clobber: Current BFS starts at 8192
[0000.455] *** Will failover to BFS 2.
[0000.459] *** Invert Offset 0 from 0x30 to 0xffffffcf
[0000.494] Clobber: returns e=0x0
[0000.497]  - failover successful; Rebooting ...
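
The failover loop in the log above can be summarized mechanically. Here is a rough sketch that splits a TegraBoot serial log into boot attempts and records which BFS was booted, whether the bootloader DTB was rejected, and which BFS the failover selected. The message strings are copied from the log in this thread; other TegraBoot versions may print different text, and the function name is my own.

```python
import re

LINE = re.compile(r"^\[(\d{4}\.\d{3})\]\s*(.*)$")      # "[0000.448] message"
BOOTING = re.compile(r"\*\*\* Booting BFS(\d+)")
FAILOVER = re.compile(r"Will failover to BFS (\d+)")

def summarize_boot_attempts(log):
    """Return one dict per boot attempt found in a TegraBoot serial log."""
    attempts, current = [], None
    for raw in log.splitlines():
        m = LINE.match(raw.strip())
        if not m:
            continue  # skip anything that is not a timestamped log line
        msg = m.group(2)
        # The version banner marks a fresh boot; also start an attempt if the
        # capture begins mid-boot, as the log in this thread does.
        if current is None or "[L4T TegraBoot]" in msg:
            current = {"bfs": None, "dtb_invalid": False, "failover_to": None}
            attempts.append(current)
        booting = BOOTING.search(msg)
        failover = FAILOVER.search(msg)
        if booting:
            current["bfs"] = int(booting.group(1))
        elif failover:
            current["failover_to"] = int(failover.group(1))
        elif "NvTbootBootloaderDTB is not valid" in msg:
            current["dtb_invalid"] = True
    return attempts
```

Run against the log above, this should report four attempts: BFS0 with an invalid DTB, a failover to BFS 1, BFS1 with an invalid DTB, and a failover to BFS 2.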

Have you performed a complete system flash, or are you just updating bits and pieces? I ask because the R24.x content is incompatible with R28.x, which is also incompatible with R32.x content. You would need to flash everything at least once, and not just parts of the boot environment. You could clone the eMMC first if you worry about losing that, but such a clone would only be for reference since the rootfs of R24.x is incompatible with R32.x.

No, I am updating the eMMC by hand using dd.
I’m aware of the kernel and rootfs incompatibility; I have a custom U-Boot that atomically switches between two rootfs and kernels.
I have devices in users’ hands, so I am not able to flash them over USB.
I’m writing an update script to update the bootloaders and kernel atomically, and I have almost succeeded in this task.

  1. I wrote BFS-1 to boot1.
  2. I wrote KFS-1 to the user area.
  3. When I update the BCT from 64 to 1, it atomically switches to BFS-1, boots my custom U-Boot from KFS-1, and U-Boot uses the new kernel and rootfs.
    An atomic update ensures that no power loss or other interruption can brick a device, which would be expensive to ship back for repair.
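
The ordering in the steps above (stage everything, then flip one switch) can be sketched roughly as follows. This is an illustration under assumptions, not the actual update script: the paths, offsets and payloads are placeholders, and the `commit` tuple stands for whatever single small write performs the BCT switch, which this thread does not spell out.

```python
import hashlib
import os

def write_and_verify(path, offset, data):
    """Write data at offset, fsync, then read it back and compare SHA-256 digests."""
    fd = os.open(path, os.O_RDWR)
    try:
        written = 0
        while written < len(data):              # pwrite may write fewer bytes than asked
            written += os.pwrite(fd, data[written:], offset + written)
        os.fsync(fd)                            # push the data out before trusting it
        readback = b""
        while len(readback) < len(data):
            chunk = os.pread(fd, len(data) - len(readback), offset + len(readback))
            if not chunk:
                break                           # short read: comparison below will fail
            readback += chunk
    finally:
        os.close(fd)
    return hashlib.sha256(readback).digest() == hashlib.sha256(data).digest()

def staged_update(staging, commit):
    """staging: list of (path, offset, data) for the inactive slots;
    commit: one (path, offset, data) that switches to them (hypothetical)."""
    for path, offset, data in staging:
        if not write_and_verify(path, offset, data):
            return False                        # nothing committed; old chain untouched
    return write_and_verify(*commit)
```

One caveat: the read-back here may be served from the page cache rather than from the eMMC, so on a real device it would need to bypass or drop the cache to prove anything about the media.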

The only problem: TBoot from BFS-1 sees the RP1 partition in the GPT and ignores PT-1, even though PT-1 is loaded (according to the logs):

Using BFS PT to query partitions 
*** Booting BFS1

I hoped that it would at least search for the RP1-1 partition, since it’s BFS1, but no.

There is always a way to dd everything, or to update non-atomically (update the BCT, then remove the RP1 partition that confuses TBoot), but that will definitely brick some devices in the wild.
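
To make the risk concrete, here is a toy model of the two-step update. It rests on one assumption taken from my description of the problem: the old chain needs RP1 in the GPT, and the new TBoot fails while RP1 is still there. The state names and the `bootable` rule are mine, not anything documented by NVIDIA.

```python
def bootable(bct_new, rp1_in_gpt):
    # Assumed rule: old BCT boots only with RP1 present,
    # new BCT boots only with RP1 removed.
    return bct_new != rp1_in_gpt

def unsafe_states(steps):
    """Apply steps in order and return every intermediate state that would
    not boot if power were lost right after that step."""
    state = {"bct_new": False, "rp1_in_gpt": True}   # device before the update
    bad = []
    for name, key, value in steps:
        state[key] = value
        if not bootable(**state):
            bad.append((name, dict(state)))
    return bad

SWITCH_BCT = ("switch BCT", "bct_new", True)
REMOVE_RP1 = ("remove RP1 from GPT", "rp1_in_gpt", False)

bct_first = unsafe_states([SWITCH_BCT, REMOVE_RP1])
gpt_first = unsafe_states([REMOVE_RP1, SWITCH_BCT])
```

Under that rule, both orderings leave exactly one unbootable window between the two writes, which is why reordering the steps alone does not help.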

I don’t think I’ll be able to answer. However, many of those partitions require being signed prior to installing. If you were to run a flash.sh command without using the option “--no-flash”, then at the end of flash the signed content would be deleted; however, if you flash and don’t actually flash because you’ve used the “--no-flash” option, then there will be a set of partitions to be flashed which have the signature. Are you using the signed versions? You are customizing somewhere which I don’t have the knowledge to say more about whether it should work, but if you have not used a signed version of the partition, then perhaps that is the last needed ingredient.

Yes, I’m using signed versions of the bootloaders and DTBs.
I am able to boot the new bootloaders and Linux after updating with dd.
The problem is doing it atomically, without the risk of bricking the device.

I have implemented the update process using: https://docs.nvidia.com/jetson/archives/l4t-archived/l4t-3261/index.html#page/Tegra%20Linux%20Driver%20Package%20Development%20Guide/bootloader_update_nano_tx1.html#wwpID0E03E0HA

I do not know of any mechanism to roll back a failed update. The partition failover (a/b redundancy) is intended for that purpose, but I have no experience with that feature and someone from NVIDIA would have to answer regarding a/b redundancy during manual dd install of content.

Hello sshmarov,

Sorry, an over-the-air update from L4T 24.2.1 to L4T 32.6.1 is not supported.
