BSP 35.6.2 reflashing slot-B with a modified version of slot-A (except the PARTUUID) does not boot

After getting A/B flashing to work (see issue "BSP 35.6.2 flashing for ROOTFS_AB=1 fails"), I tried to put a duplicate of slot-A into slot-B, after slot-A has been oem-configured and some extra bits have been added to it.

The idea is to keep slot-B with the same identities as slot-A and as close to it as possible.

So slot-A boots and works fine.

If I then:

  1. Read the slot-A partition into a file:
    ROOTFS_AB=1 sudo -E --preserve-env=ROOTFS_AB ./flash.sh -S 28GiB -r -k APP -G app_backup_a.img jetson-agx-xavier-industrial mmcblk0p1

  2. Then change the PARTUUID to be the one for slot-B:
    sudo mount app_backup_a.img.raw /mnt

    sudo sed -i -re 's/PARTUUID=[^ ]+ /PARTUUID='"$(cat bootloader/l4t-rootfs-uuid.txt_b)"' /' /mnt/boot/extlinux/extlinux.conf
    sudo sync -f /mnt/boot/extlinux/extlinux.conf
    sudo umount /mnt

  3. Write back app_backup_a.img.raw to slot-B:
    sudo rm app_backup_a.img
    sudo bootloader/mksparse --fillpattern=0 app_backup_a.img{.raw,}
    ROOTFS_AB=1 sudo -E --preserve-env=ROOTFS_AB ./flash.sh -S 28GiB -k APP_b --no-systemimg --image app_backup_a.img jetson-agx-xavier-industrial mmcblk0p1
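
Step 2 above can be sketched as a small helper that rewrites the PARTUUID and then checks that the change took effect. This is only an illustration: the function name set_root_partuuid is hypothetical, and it assumes GNU sed and that the PARTUUID value is followed by a space in the APPEND line, as the sed expression in step 2 does.

```shell
#!/bin/sh
# Hypothetical helper: replace "PARTUUID=<value> " in an extlinux.conf-style
# file with the given UUID, then verify the result.
# $1 = path to extlinux.conf, $2 = new PARTUUID
set_root_partuuid() {
    conf=$1
    uuid=$2
    # Refuse anything that does not look like a GPT partition UUID.
    printf '%s\n' "$uuid" |
        grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$' || {
        echo "invalid PARTUUID: $uuid" >&2
        return 1
    }
    sed -i -E "s/PARTUUID=[^ ]+ /PARTUUID=${uuid} /" "$conf" || return 1
    # Succeeds only if the new UUID is now present in the file.
    grep -q "PARTUUID=${uuid} " "$conf"
}
```

It would be called with the mounted image, for example: set_root_partuuid /mnt/boot/extlinux/extlinux.conf "$(cat bootloader/l4t-rootfs-uuid.txt_b)". A non-zero return means the file was not updated as expected.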

Then switch to slot-B after the Jetson reboots (sudo nvbootctrl set-active-boot-slot 1). The boot gets stuck just as the kernel is about to be launched:

Jetson UEFI firmware (version 6.2-40633251 built on 2025-05-16T01:35:20+00:00)
ESC   to enter Setup.
F11   to enter Boot Manager Menu.
Enter to continue boot.
**  WARNING: Test Key is used.  **
......
L4TLauncher: Attempting Direct Boot
EFI stub: Booting Linux Kernel...
EFI stub: Using DTB from configuration table
EFI stub: Loaded initrd from LINUX_EFI_INITRD_MEDIA_GUID device path
EFI stub: Exiting boot services and installing virtual address map...
... nothing else happens...

After more flash attempts, I sometimes get a bit of garbled text; one time it managed to print a clean set of messages before stopping:

ERROR:   MPIDR 0x80000000: exception reason=0 syndrome=0xbe000000
ERROR:   **************************************
ERROR:   RAS Error in L2, ERRSELR_EL1=0x200:
ERROR:          Status = 0xfc00640d
ERROR:          IERR = SCF to L2 Decode Error Read: 0x64
ERROR:   SERR = Illegal address (software fault): 0xd
ERROR:          Overflow (there may be more errors) - Uncorrectable
ERROR:          Uncorrectable (this is fatal)
ERROR:          MISC0 = 0x200000000100000
ERROR:          MISC1 = 0x805c000000c
ERROR:          ADDR = 0x8000000000001f20
ERROR:   **************************************
ERROR:   **************************************
ERROR:   RAS Error in SCF_IOB, ERRSELR_EL1=0x401:
ERROR:          Status = 0xf4009604
ERROR:          IERR = CBB Interface Error: 0x96
ERROR:   SERR = Assertion failure: 0x4
ERROR:          Uncorrectable (this is fatal)
ERROR:          MISC0 = 0x40
ERROR:          MISC1 = 0x3204a0c44c1
ERROR:          ADDR = 0x8000000000001f20
ERROR:   **************************************
ERROR:   RAS error handled!
ERROR:   sdei_dispatch_Unhandled Exception in EL3.
x30            = 0x000000004000f184
x0             = 0x0000000000000065
x1             = 0x000000000c198000
x2             = 0x000000008100005f
x3             = 0x00000000af4d6c04
x4             = 0x0000000000000000
x5             = 0x0000000000000405
x6             = 0x0000000000000008
x7             = 0x0000000000000000
x8             = 0x0000000040009ae0
x9             = 0x0000000000000000
x10            = 0x000000000000073d
x11            = 0xffffcf11455b0000
x12            = 0xffffcf1147676a80
x13            = 0xffffcf1147676e30
x14            = 0xffffffffffffffff
x15            = 0xffffcf1147353a30
x16            = 0x00000007429ecb60
x17            = 0x00000000bce02581
x18            = 0xffffcf11473534c0
x19            = 0x000000004001c960
x20            = 0x00000000ffffff80
x21            = 0x0000000000000065
x22            = 0x00000000400167c0
x23            = 0x0000000040016bc0
x24            = 0x0000000000000001
x25            = 0x0000000000000000
x26            = 0x0000000040010c24
x27            = 0xffffcf11473689d0
x28            = 0x0= 0x0000000000000000
tpidr_el1      = 0x0000000000000000
tpidr_el0      = 0x00000000b8000000
tpidrro_el0    = 0x0000000000000000
par_el1        = 0xff000000a0a75b00
mpidr_el1      = 0x0000000080000000
afsr0_el1      = 0x0000000000000000
afsr1_el1      = 0x0000000000000000
contextidr_el1 = 0x0000000000000000
vbar_el1       = 0x0000000000000000
cntp_ctl_el0   = 0x0000000000000000
cntp_cval_el0  = 0x000000004fa4cfa7
cntv_ctl_el0   = 0x0000000000000000
cntv_cval_el0  = 0x0000000000000000
cntkctl_el1    = 0x0000000000000000
sp_el0         = 0x0000000040016a40
isr_el1        = 0x0000000000000000
dacr32_el2     = 0x0000000000000000
ifsr32_el2     = 0x0000000000000000
actlr_el1      = 0x0000000000000001
gicc_hppir     = 0x00000000000003ff
gicc_ahppir    = 0x00000000000003ff
gicc_ctlr      = 0x0000000000000069
gicd_ispendr regs (Offsets 0x200 - 0x278)
 Offset:                        value
0000000000000200:               0x0000000000000000
0000000000000204:               0x0000000000000000
0000000000000208:               0x0000000000000000
000000000000020c:               0x0000000000000000
0000000000000210:               0x0000000000000000
0000000000000214:               0x0000000000000000
0000000000000218:               0x0000000000000000
000000000000021c:               0x0000000000000000
0000000000000220:               0x0000000000000000
0000000000000224:               0x0000000000000000
0000000000000228:               0x0000000000000000
000000000000022c:               0x0000000000000000
0000000000000230:               0x0000000000000000
0000000000000234:               0x0000000000000000
0000000000000238:               0x0000000000000000
000000000000023c:               0x0000000000000000
0000000000000240:               0x0000000000000000
0000000000000244:               0x0000000000000000
0000000000000248:               0x0000000000000000
000000000000024c:               0x0000000000000000
0000000000000250:               0x0000000000000000
0000000000000254:               0x0000000000000000
0000000000000258:               0x0000000000000000
000000000000025c:               0x0000000000000000
0000000000000260:               0x0000000000000000
0000000000000264:               0x0000000000000000
0000000000000268:               0x0000000000000000
000000000000026c:               0x0000000000000000
0000000000000270:               0x0000000000000000
0000000000000274:               0x0000000000000000
0000000000000278:               0x0000000000000000
000000000000027c:               0x0000000000000000

On some flashing attempts it manages to boot, but with some filesystem errors.

Most attempts show no filesystem errors if I go back to slot-A (via UEFI setup) and run sudo fsck -f /dev/mmcblk0p2.

I also tried the process above a bit more like the flash.sh does it, using a loop device and using a double tar to copy from the downloaded partition file to a new fresh one, just in case, but nothing appears to change.

So something is wrong, but I cannot tell what, other than that this used to work with BSP 32.7.1 (lots of things are broken now… not sure if I should blame the UEFI or if there is some hidden thing that is not documented or that I don’t know about).

Any ideas?

hello david.fernandez,

you should Manually Generate a Root File System before flashing a target with RootfsA/B.

Hi @JerryChang,

There are things that are not easy to generate before flashing… like the oem-config, and some other service installs…

Anyway, why would that not work properly?

I’ll try the docker way that is mentioned in your link, just in case there is some other incompatibility in the ext4 filesystem driver…

But I did also try to do the same thing in the jetson itself, replacing the /dev/mmcblk0p2 with the new filesystem, and the result is similar.

Tested using docker to modify the raw partition that was read back: no change.

Even reflashing the already generated bootloader/system.img_b does not work.

So it appears that the --image xxx -k APP_b function is broken.

Using the debug version of UEFI instead shows nothing new:

ESC   to enter Setup.
F11   to enter Boot Manager Menu.
Enter to continue boot.
Failed to find memory test protocol
**********************************
**  WARNING: Test Key is used.  **
**********************************
**  WARNING: Test Key is used.  **
......UpdatePcieControllersWithGpuDevice: failed to enumerate GPU device handles: Not Found
InstallFdt: Installing Kernel DTB
Processing "L4T Configuration Settings" DTB overlay
Deleting fragment fragment@0
Processing "Jetson AGX Xavier Industrial Overlay Support" DTB overlay
UpdateRamOopsMemory: RamOopsBase: 0x0, RamOopsSize: 0x0
FtpmProtocol Not Found - Not Found
UpdatePcieControllersWithGpuDevice: failed to enumerate GPU device handles: Not Found
[Bds]Booting UEFI SD Device
ERROR: C40000002:V03051002 I0 6D33944A-EC75-4855-A54D-809C75241F6C 3E11C038
UpdatePcieControllersWithGpuDevice: failed to enumerate GPU device handles: Not Found
[Bds]Booting UEFI eMMC Device
add-symbol-file /home/david.fernandez/src_sen/vcpu/NL4T_LinuxForTegra/uefi_build/Build/Jetson/DEBUG_GCC5/AARCH64/Silicon/NVIDIA/Application/L4TLauncher/L4TLauncher/DEBUG/L4TLauncher.dll 0x734B8C000
Loading driver at 0x00734B8B000 EntryPoint=0x00734B95EB0 L4TLauncher.efi

L4TLauncher: Attempting Direct Boot
Processing "L4T Configuration Settings" DTB overlay
Deleting fragment fragment@0
Processing "Jetson AGX Xavier Industrial Overlay Support" DTB overlay
UpdateRamOopsMemory: RamOopsBase: 0x0, RamOopsSize: 0x0
FtpmProtocol Not Found - Not Found
UpdatePcieControllersWithGpuDevice: failed to enumerate GPU device handles: Not Found
Loading driver at 0x00732550000 EntryPoint=0x00733F3D34C
Loading driver at 0x00732550000 EntryPoint=0x00733F3D34C 
root=PARTUUID=b289c3d4-9963-404c-94e6-e03c9052d429 rw rootwait rootfstype=ext4 mminit_loglevel=4 console=ttyTCU0,115200n8 console=tty0 fbcon=map:0 net.ifnames=0 rootfstype=ext4 video=efifb:off ExtLinuxBoot: Cmdline: 
EFI stub: Booting Linux Kernel...
EFI stub: Using DTB from configuration table
EFI stub: Loaded initrd from LINUX_EFI_INITRD_MEDIA_GUID device path
EFI stub: Exiting boot services and installing virtual address map...
DisableVbus: Couldn't Disable Regulator: 29 for USB Port: 0
DisableVbus: Couldn't Disable Regulator: 29 for USB Port: 3
PROGRESS CODE: V03101019 I0 D6A2CB7F-6A18-4E2F-B43B-9920A733700A
... nothing else happens ...

Right, I have run some more tests…

Seems that reading the slot-A partition, replacing the rootfs filesystem with its contents, and flashing it into slot-B does not work either.

Even modifying the other slot's partition from within the Jetson breaks it.

The only thing that works at the moment is reading slot-A, replacing rootfs with its contents, then flashing the whole Jetson again.

Clearly that shows that what used to work in previous BSPs, and what appears a reasonable expectation, i.e. reading a slot partition and writing it to another slot, is now broken, and only re-flashing the whole Jetson works.

There is clearly some operation missing in flash.sh when reflashing a partition.

I wonder if there is a “feature” that this is due to, or if it is just a bug that should be fixed.

Anything on this?

hello david.fernandez,

could you please try an alternative way, using the backup/restore script,
for instance, $OUT/Linux_for_Tegra/tools/backup_restore

may I know which Jetpack release you have tested?
besides, please also share your steps for reference, thanks

Hi @JerryChang,

I’ll look into the backup_restore thing in a little while.

I am using the BSP 35.6.2 tarball with the Sample Root Filesystem tarball for it, all from the BSP 35.6.2 download page: Jetson Linux | NVIDIA Developer

I enumerated the steps in bullet points here: BSP 35.6.2 reflashing slot-B with a modified version of slot-A (except the PARTUUID) does not boot

I’ll add below the steps that use the root filesystem as it is after flashing in order to duplicate it… I just want to repeat them first, as I still see instabilities when flashing that way, but basically:

  1. Flash using the standard method (except changing flash_xxx_rootfs_ab.xml to allow usage of the full 64GB of flash in the Jetson AGX Xavier industrial).
  2. Go through the oem-config after flash for the slot-A.
  3. Read back the slot-A using the ./flash.sh -r -k -G.
  4. Move the standard rootfs folder away.
  5. Mount that slot-A raw file and copy the whole rootfs to the host into the rootfs folder using a double tar, like flash.sh does (I guess cp -a should do too).
  6. Rerun standard flash.sh with the read back rootfs.

Seems that even that ends up with an unstable system… it boots fine, but some things fail, like the less command not using all the terminal lines when it did before, the USB device mode not working, etc.

So I’ll repeat again just to be sure.

Right, steps in more detail for the method that appears to work (as root, after flashing the standard way, as in step 10, and going through oem-config in slot-A):

  1. Read back slot-A:
    ROOTFS_AB=1 ./flash.sh -S 28GiB -r -k APP -G app_backup_a.img jetson-agx-xavier-industrial mmcblk0p1

  2. Move the standard rootfs folder away:
    rm -rf rootfs/*

  3. Mount the slot-A raw file:
    mkdir -p /tmp/tmp_app_a; mount app_backup_a.img.raw /tmp/tmp_app_a

  4. Copy the whole rootfs to the host into rootfs folder:
    tar -C /tmp/tmp_app_a --xattrs --xattrs-include='*' -cpf - --exclude='tmp/*' $(ls -1A /tmp/tmp_app_a | sed -re'/=$/ d; s/[*/>@|]?$//') | tar -C rootfs --xattrs --xattrs-include='*' -xpf - --checkpoint=.20000

  5. Set the extlinux that the standard rootfs expects:
    cp -f bootloader/extlinux.conf rootfs/boot/extlinux/

  6. Rerun standard flash.sh with the read back rootfs:
    ROOTFS_AB=1 ./flash.sh -S 28GiB jetson-agx-xavier-industrial mmcblk0p1

Seems that everything works, except that I see strange behaviour for less on the serial port (does not happen on ssh or telnet) that I did not see on the first flash (perhaps buffering or something has changed as I see corrupt output when doing less -SN <file> now).

I tried the backup_restore script, and it also seems broken:

$ ROOTFS_TYPE="ext4 -O ^metadata_csum_seed,^orphan_file" ROOTFS_AB=1 sudo -E --preserve-env=ROOTFS_TYPE,ROOTFS_AB tools/backup_restore/l4t_backup_restore.sh -e mmcblk0p1 -b jetson-agx-xavier-industrial
/home/david.fernandez/src_sen/vcpu/NL4T_LinuxForTegra/tools/kernel_flash/l4t_initrd_flash_internal.sh --no-flash --initrd --showlogs jetson-agx-xavier-ind-stv-a-ep mmcblk0p1
******************************************
*                                        *
*  Step 1: Generate rcm boot commandline *
*                                        *
******************************************
ROOTFS_AB= ROOTFS_ENC= /home/david.fernandez/src_sen/vcpu/NL4T_LinuxForTegra/flash.sh  --no-flash --rcm-boot jetson-agx-xavier-ind-stv-a-ep mmcblk0p1
###############################################################################
# L4T BSP Information:
# R35 , REVISION: 6.2
# User release: 0.0
###############################################################################
ECID is 0x88021911644501c2100000000cff8280
copying soft_fuses(/home/david.fernandez/src_sen/vcpu/NL4T_LinuxForTegra/bootloader/t186ref/BCT/tegra194-mb1-soft-fuses-l4t.cfg)... done.
./tegraflash.py --chip 0x19 --applet "/home/david.fernandez/src_sen/vcpu/NL4T_LinuxForTegra/bootloader/mb1_t194_prod.bin" --skipuid --soft_fuses tegra194-mb1-soft-fuses-l4t.cfg --bins "mb2_applet nvtboot_applet_t194.bin" --cmd "dump eeprom boardinfo cvm.bin;reboot recovery" 
Welcome to Tegra Flash
version 1.0.0
Type ? or help for help and q or quit to exit
Use ! to execute system commands
...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for device to expose ssh ......Error: ipv6: address already assigned.
Error: ipv6: address already assigned.
Device has booted into initrd. You can ssh to the target by the command:
$ ssh root@fe80::1%enp0s20f0u2u3
Cleaning up...
Log is saved to Linux_for_Tegra/initrdlog/flash_1-2.3_0_20251021-152432.log 
chown: warning: '.' should be ':': ‘root.root’
Run command: 
ln -s /proc/self/fd /dev/fd && mount -o nolock [fc00:1:1::1]:/home/david.fernandez/src_sen/vcpu/NL4T_LinuxForTegra/tools/backup_restore /mnt && /mnt/nvbackup_partitions.sh -e mmcblk0p1 -n  && echo Backup image is stored in /home/david.fernandez/src_sen/vcpu/NL4T_LinuxForTegra/tools/backup_restore/images
 on root@fc00:1:1::2
cat: /sys/block/mmcblk0p1/size: No such file or directory

Apparently, that /sys/block/mmcblk0p1/size should be /sys/block/mmcblk0/mmcblk0p1/size.

There are more instances of that in the restore_partitions.sh.
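
For reference, /sys/class/block/<name>/size resolves for both whole disks and partitions and reports the size in 512-byte sectors, so it avoids the disk/partition path difference. A minimal sketch of a lookup on that basis follows; the function name blockdev_size_bytes and the optional sysfs-root argument are made up for illustration and are not part of the NVIDIA scripts.

```shell
#!/bin/sh
# Hypothetical helper: print the size in bytes of a block device or partition.
# $1 = kernel name (e.g. mmcblk0 or mmcblk0p1)
# $2 = sysfs root (optional, defaults to /sys; only useful for testing)
blockdev_size_bytes() {
    name=$1
    sysroot=${2:-/sys}
    size_file="$sysroot/class/block/$name/size"
    [ -r "$size_file" ] || {
        echo "no such block device: $name" >&2
        return 1
    }
    # sysfs reports the size in 512-byte sectors regardless of the
    # device's logical block size.
    sectors=$(cat "$size_file") || return 1
    echo $((sectors * 512))
}
```

On the target, blockdev_size_bytes mmcblk0p1 and blockdev_size_bytes mmcblk0 would then both work with the same code path.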

Also, I wonder if things will work when it decides to get rid of ROOTFS_AB=1, as it shows at the beginning, as the partition table used will be wrong… it might work for mmcblk0p1, but not for mmcblk0p2, I guess.

I put the ROOTFSSIZE=28GiB in the board configuration file for the ROOTFS_AB=1 cases, so even that flash command should work if it keeps the ROOTFS_AB=1.

It would be great if you guys provided fixed tools/backup_restore/* utilities, so that I don’t have to figure out everything there.

hello david.fernandez,

please also refer to rel-36 developer guide, Cloning rootfs with initrd.
you may refer to the instructions on cloning the rootfs (system.img ).
after booting into initrd, ensure that the /etc/fstab file does not contain any disk-specific UUIDs, as this may cause mount failures on other devices.
the fstab file included in the clone system.img should be same as a clean BSP.

Hi @JerryChang,

I’ll have a look into the cloning thing.

Seems that the backup_restore -e device has to be the full disk/eMMC, rather than a partition, so I misunderstood that before.

The /etc/fstab does not contain UUIDs at all, but it needs to contain extra entries to mount external sd-cards, and temp filesystems, but all the entries are referring to the specific devices or files, not UUIDs.

I wonder if you are referring to the extlinux.conf, rather than /etc/fstab.

The extlinux.conf is explicitly replaced with the clean one from the bootloader folder, which does not contain specific entries for the partitions, so flash.sh can modify it as usual.

Hi @JerryChang,

Tried l4t_backup_restore.sh (properly) this time.

It works as a backup and restore… but if I duplicate mmcblk0p1 to mmcblk0p2 (copying the tar.gz file, then unpacking, modifying extlinux.conf, and repacking), it goes wrong (it seems to omit deleting the blocks when restoring mmcblk0p2), and eventually the boot fails when mounting the partition from inside the initrd.

So seems that there is something secret that prevents modifying partitions…

I’ll try the cloning thing next, see if that allows me to modify a partition, or what.

Scratch that… apparently it was sometimes booting from an external sd-card, even with the present switch off… that is why it appeared to fail to boot.

Right, I CAN use l4t_backup_restore.sh to duplicate a partition (copying the tar.gz archive for it, and replacing the extlinux.conf, then recalculating the sha256sum and updating it in the nvpartitionmap.txt in images).
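
The checksum update can be sketched as below. This is a hedged illustration, not the tool's own procedure: it assumes only that the archive's SHA-256 appears as a hex string somewhere in nvpartitionmap.txt (the exact layout of that file is not shown in this thread), and the function name update_sha256_in_map and its arguments are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: after an archive has been modified, swap its old
# SHA-256 for the new one wherever the old hash appears in a map file.
# $1 = old hash (recorded before modifying the archive)
# $2 = modified archive
# $3 = map file (e.g. images/nvpartitionmap.txt)
update_sha256_in_map() {
    old_sum=$1
    archive=$2
    map=$3
    new_sum=$(sha256sum "$archive" | cut -d' ' -f1)
    # An empty result means sha256sum could not read the archive.
    [ -n "$new_sum" ] || return 1
    # Bail out if the old hash is not in the map, rather than silently
    # leaving a stale checksum behind.
    grep -q "$old_sum" "$map" || {
        echo "old checksum not found in $map" >&2
        return 1
    }
    sed -i "s/$old_sum/$new_sum/" "$map"
}
```

The old hash has to be recorded with sha256sum before the archive is unpacked and repacked; matching on the hash itself avoids depending on the column layout of the map file.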

It gives a scary error about umount taking too much time.

It also seems to affect some marginal thing in the uart, similar to my method of replacing rootfs and reflashing.

Tried also the clone method.

It seems to work; just as with the other two, there is some slight change in the debug uart (ttyUSBx) timing or control characters (I have not seen TERM change, so I’ll have to investigate that a bit more).

In any case, NVIDIA should fix flash.sh so it can properly do what it used to do before… but at least there are new methods that I can use to keep going.