Is there a known issue related to A/B Redundancy and Initrd Flash (NVMe)? JetPack 5.1.1

Hey,

I have a Xavier NX devkit and am using the flash.sh script to flash to SD card.
To flash to NVMe I use the initrd flash.

SD:

ADDITIONAL_DTB_OVERLAY=BootOrderInternal.dtbo ROOTFS_AB=1 ROOTFS_RETRY_COUNT_MAX=3 ./flash.sh -S 6GiB p3509-0000+p3668 mmcblk0p1  

NVMe:

ADDITIONAL_DTB_OVERLAY=BootOrderNvme.dtbo ROOTFS_AB=1 ROOTFS_RETRY_COUNT_MAX=3 ./tools/kernel_flash/l4t_initrd_flash.sh --external-device nvme0n1p1 -S 20GiB --showlogs -c ./tools/kernel_flash/flash_l4t_nvme_rootfs_ab.xml --network usb0 p3509-0000+p3668-nvme external

After flashing to SD card I can switch in UEFI between NVMe and SD and both A/B slots work.

When I flash to NVMe and try to switch the slot with nvbootctrl, the switch does not persist. The tool shows the slot as changed only until the reboot; after the reboot, slot A is booted again. The same if I switch to sd card.
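
For anyone trying to reproduce this, the check can be sketched roughly as below. This is only a sketch: the function names and the state file path are made up for illustration, and the nvbootctrl calls are left as comments because they need root on the Jetson and a reboot in between. The reported slot is noted by hand rather than parsed, since the exact output format is not something I want to assume.

```shell
#!/bin/sh
# Sketch only: helpers for checking whether a rootfs slot switch survives
# a reboot. Function names and STATE_FILE are hypothetical.

STATE_FILE="${STATE_FILE:-/var/tmp/ab_expected_slot}"

# Map slot 0 (A) to 1 (B) and vice versa.
other_slot() {
    case "$1" in
        0) echo 1 ;;
        1) echo 0 ;;
        *) echo "unknown slot: $1" >&2; return 1 ;;
    esac
}

# Compare the slot that was requested ($1) with the slot actually booted ($2).
check_switch() {
    if [ "$1" = "$2" ]; then
        echo "OK: booted slot $2 as requested"
    else
        echo "FAIL: requested slot $1 but booted slot $2"
        return 1
    fi
}

# Before the reboot (as root on the Jetson):
#   nvbootctrl -t rootfs dump-slots-info
#   other_slot <current slot number> > "$STATE_FILE"
#   nvbootctrl -t rootfs set-active-boot-slot "$(cat "$STATE_FILE")"
#   reboot
# After the reboot:
#   nvbootctrl -t rootfs get-current-slot     # note the reported slot
#   check_switch "$(cat "$STATE_FILE")" <reported slot number>
```

With the behaviour described above, the last step would report FAIL after flashing with l4t_initrd_flash.sh, because slot A (0) comes back up even though slot B (1) was requested.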

I thought I’d then try replacing only the QSPI with flash.sh:

sudo ADDITIONAL_DTB_OVERLAY=BootOrderInternal.dtbo ROOTFS_AB=1 ROOTFS_RETRY_COUNT_MAX=3 ./flash.sh -S 6GiB jetson-xavier-nx-devkit-qspi mmcblk0p1

For some reason that did not make A/B work again even though I had expected it.

Is this a known issue? Does anyone else notice the same?

Hi,
so you mean the A/B redundancy works when A/B slots reside in SD Card and NVMe drive respectively,
but fails when both of them are in the SD Card/NVMe drive?
Can you share what command you used to flash them?

Hey @DaveYYY the commands are above.

A/B switching works when I flash QSPI+SD with flash.sh
A/B switching does NOT work when I flash QSPI+NVMe with l4t_initrd_flash.sh

Hi,
sorry that I’m still not quite sure about the scenario.

After flashing to SD card I can switch in UEFI between NVMe and SD and both A/B slots work.

So both NVMe and SD Card were flashed with A/B slots, and they co-existed, but only the one on SD card worked?

The same if I switch to sd card.

What does this mean?

Okay, I'll try to explain it in more detail.

The devkit has an SD card and an NVMe SSD attached to it.
I flashed both with A/B systems, one by one, using the respective commands above.

I can use the UEFI to switch the primary boot target. So I tried the A/B mechanism with both flash methods in combination with both boot targets.

When I used initrd flash, A/B does not work whether I boot from NVMe or SD.

When I used flash.sh, A/B works for both NVMe and SD.
So I can exclude any issues related to the rootfs. The problem must be introduced by the way the initrd flash creates or flashes the QSPI part of the system.

Hi, sorry for asking more.

Did you mean that both systems on the SD Card and the NVMe SSD originally worked with A/B redundancy, that it broke after you re-flashed the NVMe SSD with initrd flash, and that it worked again after the SD Card was re-flashed with flash.sh?

Or did you mean, assuming both storage devices were empty:

  • Flashing SD Card with flash.sh first, and then flashing NVMe SSD with initrd flash failed.
  • Flashing NVMe SSD with initrd flash first, and then flashing SD Card with flash.sh worked.

Can you provide a flowchart here?

Well, it worked at some point, but that was before release 5.1.1.
With 5.1.1 I can’t seem to get l4t_initrd_flash.sh to flash a working A/B system anymore.

Flashing itself always succeeds. It is only the A/B system that does not work.

Xavier NX Devkit with SD Card and NVMe plugged in

→ flash SD with flash.sh → test A/B on SD → works

→ reflash NVMe with initrd_flash.sh (SD untouched) → test A/B on NVMe → does not work
→ switch to SD boot in UEFI → test A/B on SD → does not work

→ reflash SD again with flash.sh (NVMe untouched) → test A/B on SD → works
→ switch to NVMe boot in UEFI → test A/B on NVMe → works

Since you guys tend to ask about this: in the meantime I tried flashing with an unmodified version of JetPack 5.1.1, and the issue persists.

Hi,
thanks for the information.
We’ll see if we can reproduce this issue on our side.

Please use this command, it should work with your default BSP:

sudo ROOTFS_AB=1 ROOTFS_RETRY_COUNT_MAX=3 ./tools/kernel_flash/l4t_initrd_flash.sh --external-device nvme0n1p1 -S 10GiB --showlogs -c ./tools/kernel_flash/flash_l4t_nvme_rootfs_ab.xml --network usb0 jetson-xavier-nx-devkit-qspi external

Xavier NX Devkit with p3668-0001 (SD Card Module)

Hi,
it’s confirmed that we can reproduce this issue on our side.
We are now tracking it internally, and will notify you when there’s any update.

Hey @DaveYYY thanks for the info.
Since we use Orin devices too, can you comment if this affects only Xavier or is a general issue?

Hi,

I’ve tested on AGX Orin, and the same issue did not occur there.
After some discussion with our team, it looks like it may be caused by misuse of the flashing command, and I’m still waiting for a reply from them.