A/B root flashing of Nvidia Jetson Xavier with L4T 35.5.0

With the following set of commands I am able to flash the internal mmcblk and the built-in Samsung 1 TB SSD:

cd /home/.../nvidia/R35.5.0/Linux_for_Tegra
sudo ROOTFS_AB=1 ROOTFS_RETRY_COUNT_MAX=3 ./flash.sh jetson-agx-xavier-devkit mmcblk0p1 

# reboot and put into recovery mode !

sudo ./tools/kernel_flash/l4t_initrd_flash.sh --erase-all --external-device nvme0n1p1 -c ./tools/kernel_flash/flash_l4t_nvme.xml -S 900GiB --showlogs jetson-agx-xavier-devkit mmcblk0p1

# reboot and put into recovery mode !

sudo ROOTFS_AB=1 ROOTFS_RETRY_COUNT_MAX=3 ./flash.sh jetson-agx-xavier-devkit nvme0n1p1

After this the SSD also has the A/B slot feature enabled.

  • Is there a simpler way than doing these 3 consecutive flashes?
  • Can I skip the middle flash command, for instance?
  • Do I need the --erase-all flag in the middle command?
  • I also set num_sectors in ./tools/kernel_flash/flash_l4t_nvme.xml to 1953525168 (for the Samsung 1 TB). Is this necessary, or would the above commands detect this?

I do not understand why these 2 commands do not work:

cd /home/.../nvidia/R35.5.0/Linux_for_Tegra
sudo ROOTFS_AB=1 ROOTFS_RETRY_COUNT_MAX=3 ./flash.sh jetson-agx-xavier-devkit mmcblk0p1 

sudo ./tools/kernel_flash/l4t_initrd_flash.sh --external-device nvme0n1 -c ./tools/kernel_flash/flash_l4t_nvme_ab.xml -S 900GiB --showlogs jetson-agx-xavier-devkit mmcblk0p1

The second of the 2 commands always fails in the very first step, the image building. Should this not work as well?

Hi y1thu,

Are you using Xavier NX or AGX Xavier in your case?
The devkit or custom carrier board?

Do you want to use internal eMMC or external NVMe as boot device?

It is necessary to configure num_sectors since you are using a 1 TB NVMe SSD.
But you should probably modify flash_l4t_nvme_rootfs_ab.xml instead, since you want to enable redundant rootfs.
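
For reference, the num_sectors value can be derived from the byte size that fdisk prints for the disk later in this thread. The following is only a minimal sketch, assuming 512-byte logical sectors (which is what fdisk reports for this SSD); bytes_to_sectors is a hypothetical helper name, not part of L4T.

```shell
# Convert a disk size in bytes to a count of sectors.
# The default sector size of 512 bytes is an assumption; confirm it with the
# "Units: sectors of 1 * 512 = 512 bytes" line in the fdisk output.
bytes_to_sectors() (
    bytes=$1
    sector_size=${2:-512}
    if [ $((bytes % sector_size)) -ne 0 ]; then
        echo "size is not a whole number of sectors" >&2
        exit 1
    fi
    echo $((bytes / sector_size))
)

# Byte size of the Samsung 1 TB SSD as reported by fdisk in this thread.
bytes_to_sectors 1000204886016
```

On the target itself, the same number can usually be read directly with sudo blockdev --getsz /dev/nvme0n1, which prints the device size in 512-byte sectors.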

Please share the full flash log when you run this command.
Maybe you should use flash_l4t_nvme_rootfs_ab.xml instead of flash_l4t_nvme_ab.xml.

Hi KevinFFF,

I would like to create an A/B rootfs on the NVMe (Samsung 1 TB) and boot from mmcblk01. The board is an NVIDIA AGX Xavier devkit.
I did try with flash_l4t_nvme_rootfs_ab.xml (sorry, that was a typo above). It did not work.

My understanding is that I have to issue:

sudo ./tools/kernel_flash/l4t_initrd_flash.sh --external-device nvme0n1 -c ./tools/kernel_flash/flash_l4t_nvme_rootfs_ab.xml -S 900GiB --showlogs jetson-agx-xavier-devkit mmcblk0p1

which will partition the NVMe for A/B root. Will this also split the boot partition into A/B?

Then issuing:

sudo ./flash.sh jetson-agx-xavier-devkit nvme0n1p1

will tell the board to boot from A or B on the NVMe and mount the A/B root from the NVMe. Is this understanding correct?

Do I need to add ROOTFS_AB=1 ROOTFS_RETRY_COUNT_MAX=3 to any of the commands above? Maybe the second?

Are you using Xavier NX or AGX Xavier?

Yes, this command should work in your case.
Please share the full flash log for further check.

You don’t need to run this command.

Neither of these 2 options is needed, since the default retry count is 3 and you’ve specified flash_l4t_nvme_rootfs_ab.xml as the partition layout to enable rootfs A/B.

Dear KevinFFF,

here is what I tried:

  1. Set num_sectors to 1953525168 in ./tools/kernel_flash/flash_l4t_nvme_rootfs_ab.xml, since:
    aaeon@aaeon-xavier:~$ sudo fdisk -l /dev/nvme0n1
    Disk /dev/nvme0n1: 931.53 GiB, 1000204886016 bytes, 1953525168 sectors
  2. Ran the command:
sudo ./tools/kernel_flash/l4t_initrd_flash.sh --external-device nvme0n1 -c ./tools/kernel_flash/flash_l4t_nvme_rootfs_ab.xml -S 450GiB --showlogs jetson-agx-xavier-devkit mmcblk0p1

     It did not work with -S 900GiB, of course, since this creates an APP and an APP_b partition of 900GiB each, which do not fit on the SSD. Instead, -S 450GiB needs to be used, which sets the size of APP and APP_b to 450GiB each, adding up to the allowed 900GiB. This behaviour is not documented anywhere.
  3. Then I ran into the errors in the attached error.log.
error.log (115.4 KB)
This shows 2 problems:

/home/thomas/nvidia/R35.5.0/Linux_for_Tegra/flash.sh: line 3235: python: command not found

So I replaced python on line 3235 with python3.
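
Before editing flash.sh, it may be worth checking which interpreter names actually exist on the host. This is only a sketch; first_available is a hypothetical helper, and the candidate names passed to it are assumptions about what a host might have installed.

```shell
# Print the first command name from the argument list that exists on PATH.
# Returns non-zero if none of the names can be found.
first_available() {
    for name in "$@"; do
        if command -v "$name" >/dev/null 2>&1; then
            echo "$name"
            return 0
        fi
    done
    return 1
}

# Report whether a bare "python" (the name flash.sh calls) would resolve.
if command -v python >/dev/null 2>&1; then
    echo "python resolves to: $(command -v python)"
else
    echo "python is not on PATH; alternative found: $(first_available python3 python2 || echo none)"
fi
```

On Ubuntu 20.04 and later hosts, installing the python-is-python3 package is another way to provide a python command without touching the script. Which of the two is preferable here is a judgment call.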
The second, more severe problem: [ 1.0735 ] File boot.img_b open failed. It turns out that boot.img_b is never generated in:

thomas@nuci7-lan:~/nvidia/R35.5.0/Linux_for_Tegra/bootloader$ ls -alh boot*
-rw-r--r-- 1 root   root   46M Aug 17 21:59 boot0.img
-rw-r--r-- 1 root   root   42M Aug 17 21:59 boot.img
lrwxrwxrwx 1 thomas thomas   8 Aug 17 21:59 boot.img_b -> boot.img
-rw-r--r-- 1 root   root   42M Aug 17 21:58 boot.img.sb

So I created a symlink boot.img_b pointing to boot.img, watching in a 2nd terminal for the moment the bootloader directory is newly populated. I do not know if this is correct, and it definitely needs to be fixed in the script somewhere.
This symlink is also copied to external:

thomas@nuci7-lan:~/nvidia/R35.5.0/Linux_for_Tegra/bootloader$ ls -alh /home/thomas/nvidia/R35.5.0/Linux_for_Tegra/tools/kernel_flash/images/external
total 3.3G
drwxr-xr-x 2 root   root   4.0K Aug 17 21:58 .
drwxr-xr-x 4 root   root   4.0K Aug 17 21:58 ..
-rw-r--r-- 1 root   root    42M Aug 17 21:54 boot.img
lrwxrwxrwx 1 thomas thomas    8 Aug 17 21:54 boot.img_b -> boot.img
-rw-r--r-- 1 root   root    64M Aug 17 21:58 esp.img
-rw-r--r-- 1 root   root     78 Aug 17 21:58 flash.cfg
-rw-r--r-- 1 root   root   1.9K Aug 17 21:58 flash.idx
-rw-r--r-- 1 root   root    17K Aug 17 21:58 gpt_primary_9_0.bin
-rw-r--r-- 1 root   root    17K Aug 17 21:58 gpt_secondary_9_0.bin
-rw-r--r-- 1 root   root   385K Aug 17 21:54 kernel_tegra194-p2888-0001-p2822-0000.dtb
-rw-r--r-- 1 root   root    512 Aug 17 21:58 mbr_9_0.bin
-rw-r--r-- 1 root   root    45M Aug 17 21:54 recovery.img
-rw-r--r-- 1 root   root   1.7G Aug 17 21:58 system.img
-rw-r--r-- 1 root   root   1.5G Aug 17 21:58 system.img_b
-rw-r--r-- 1 root   root     41 Aug 17 21:58 system.img_b.sha1sum
-rw-r--r-- 1 root   root     41 Aug 17 21:58 system.img.sha1sum
-rw-r--r-- 1 root   root   385K Aug 17 21:54 tegra194-p2888-0001-p2822-0000.dtb.rec
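
The symlink workaround described above can be sketched as follows. This only reproduces the poster's hack and is not a confirmed fix; link_boot_b is a hypothetical helper, and the demonstration runs in a scratch directory rather than in the real bootloader/ folder.

```shell
# Create "boot.img_b" as a relative symlink to "boot.img" inside the given
# directory, but only if boot.img exists and boot.img_b does not yet exist.
link_boot_b() (
    dir=$1
    cd "$dir" || exit 1
    [ -e boot.img ] || { echo "boot.img not found in $dir" >&2; exit 1; }
    if [ -e boot.img_b ] || [ -L boot.img_b ]; then
        exit 0   # leave an existing file or link untouched
    fi
    ln -s boot.img boot.img_b
)

# Demonstration in a temporary directory with an empty placeholder boot.img.
demo=$(mktemp -d)
: > "$demo/boot.img"
link_boot_b "$demo" && readlink "$demo/boot.img_b"
rm -rf "$demo"
```

Because the link is relative, it still resolves after being copied to another directory, as long as a boot.img sits next to it there, which matches the listing of images/external above.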

The initrd runs through successfully with this “hack”. See success.log.
success.log (246.4 KB)

Everything boots up ok but:

aaeon@aaeon-xavier:~$ sudo nvbootctrl -t rootfs dump-slots-info
RootFS A/B is not enabled.

The partition layout is as expected:

aaeon@aaeon-xavier:~$ sudo fdisk -l /dev/nvme0n1
Disk /dev/nvme0n1: 931.53 GiB, 1000204886016 bytes, 1953525168 sectors
Disk model: Samsung SSD 970 EVO Plus 1TB            
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 77DCDAAB-F93B-4D3E-8BC7-27028FC04067

Device              Start        End   Sectors  Size Type
/dev/nvme0n1p1    2682408  946400807 943718400  450G Microsoft basic data
/dev/nvme0n1p2  946400808 1890119207 943718400  450G Microsoft basic data
/dev/nvme0n1p3         40     262183    262144  128M Microsoft basic data
/dev/nvme0n1p4     262184     263719      1536  768K Microsoft basic data
/dev/nvme0n1p5     263720     328487     64768 31.6M Microsoft basic data
/dev/nvme0n1p6     328488     590631    262144  128M Microsoft basic data
/dev/nvme0n1p7     590632     592167      1536  768K Microsoft basic data
/dev/nvme0n1p8     592168     656935     64768 31.6M Microsoft basic data
/dev/nvme0n1p9     656936     820775    163840   80M Microsoft basic data
/dev/nvme0n1p10    820776     821799      1024  512K Microsoft basic data
/dev/nvme0n1p11    821800    1436199    614400  300M Microsoft basic data
/dev/nvme0n1p12   1436200    1567271    131072   64M EFI System
/dev/nvme0n1p13   1567272    1731111    163840   80M Microsoft basic data
/dev/nvme0n1p14   1731112    1732135      1024  512K Microsoft basic data
/dev/nvme0n1p15   1732136    1863207    131072   64M Microsoft basic data
/dev/nvme0n1p16   1863208    2682407    819200  400M Microsoft basic data
aaeon@aaeon-xavier:~$ sudo fdisk -l /dev/mmcblk0
Disk /dev/mmcblk0: 29.13 GiB, 31272730624 bytes, 61079552 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 31C505AC-095F-4220-A57F-F725571D7964

Device             Start      End  Sectors   Size Type
/dev/mmcblk0p1        40 58720295 58720256    28G Microsoft basic data
/dev/mmcblk0p2  58720296 58720615      320   160K Microsoft basic data
/dev/mmcblk0p3  58720616 58728807     8192     4M Microsoft basic data
/dev/mmcblk0p4  58728808 58736999     8192     4M Microsoft basic data
/dev/mmcblk0p5  58737000 58738023     1024   512K Microsoft basic data
/dev/mmcblk0p6  58738024 58743143     5120   2.5M Microsoft basic data
/dev/mmcblk0p7  58743144 58743271      128    64K Microsoft basic data
/dev/mmcblk0p8  58743272 58746343     3072   1.5M Microsoft basic data
/dev/mmcblk0p9  58746344 58748391     2048     1M Microsoft basic data
/dev/mmcblk0p10 58748392 58748711      320   160K Microsoft basic data
/dev/mmcblk0p11 58748712 58752807     4096     2M Microsoft basic data
/dev/mmcblk0p12 58752808 58756903     4096     2M Microsoft basic data
/dev/mmcblk0p13 58756904 58758951     2048     1M Microsoft basic data
/dev/mmcblk0p14 58758952 58759207      256   128K Microsoft basic data
/dev/mmcblk0p15 58759208 58923047   163840    80M Microsoft basic data
/dev/mmcblk0p16 58923048 58924071     1024   512K Microsoft basic data
/dev/mmcblk0p17 58924072 58989607    65536    32M Microsoft basic data
/dev/mmcblk0p18 58990592 58990911      320   160K Microsoft basic data
/dev/mmcblk0p19 58990912 58999103     8192     4M Microsoft basic data
/dev/mmcblk0p20 58999104 59007295     8192     4M Microsoft basic data
/dev/mmcblk0p21 59007296 59008319     1024   512K Microsoft basic data
/dev/mmcblk0p22 59008320 59013439     5120   2.5M Microsoft basic data
/dev/mmcblk0p23 59013440 59013567      128    64K Microsoft basic data
/dev/mmcblk0p24 59013568 59016639     3072   1.5M Microsoft basic data
/dev/mmcblk0p25 59016640 59018687     2048     1M Microsoft basic data
/dev/mmcblk0p26 59018688 59019007      320   160K Microsoft basic data
/dev/mmcblk0p27 59019008 59023103     4096     2M Microsoft basic data
/dev/mmcblk0p28 59023104 59027199     4096     2M Microsoft basic data
/dev/mmcblk0p29 59027200 59029247     2048     1M Microsoft basic data
/dev/mmcblk0p30 59029248 59029503      256   128K Microsoft basic data
/dev/mmcblk0p31 59029504 59193343   163840    80M Microsoft basic data
/dev/mmcblk0p32 59193344 59194367     1024   512K Microsoft basic data
/dev/mmcblk0p33 59194368 59260927    66560  32.5M Microsoft basic data
/dev/mmcblk0p34 59260928 59424767   163840    80M Microsoft basic data
/dev/mmcblk0p35 59424768 59425791     1024   512K Microsoft basic data
/dev/mmcblk0p36 59425792 59442175    16384     8M Microsoft basic data
/dev/mmcblk0p37 59442176 59458559    16384     8M Microsoft basic data
/dev/mmcblk0p38 59458560 60072959   614400   300M Microsoft basic data
/dev/mmcblk0p39 60072960 60073215      256   128K Microsoft basic data
/dev/mmcblk0p40 60073216 60204287   131072    64M EFI System
/dev/mmcblk0p41 60204288 60368127   163840    80M Microsoft basic data
/dev/mmcblk0p42 60368128 60369151     1024   512K Microsoft basic data
/dev/mmcblk0p43 60369152 60500223   131072    64M Microsoft basic data
/dev/mmcblk0p44 60500224 61079511   579288 282.9M Microsoft basic data
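
The sizing issue with -S noted earlier can be checked against the sector counts in the fdisk output above. This is a coarse sketch, assuming that -S applies to APP and APP_b individually (as observed in this thread) and ignoring the smaller partitions in the layout; ab_rootfs_fits is a hypothetical helper name.

```shell
# Succeed if two rootfs partitions of size_gib GiB each fit on a disk with
# the given number of 512-byte sectors. The other, smaller partitions are
# ignored, so this is only an upper-bound check.
ab_rootfs_fits() (
    size_gib=$1
    disk_sectors=$2
    need=$((2 * size_gib * 1024 * 1024 * 1024 / 512))
    [ "$need" -le "$disk_sectors" ]
)

# Sector count of the 1 TB SSD as reported by fdisk in this thread.
for s in 900 450; do
    if ab_rootfs_fits "$s" 1953525168; then
        echo "-S ${s}GiB: two copies fit"
    else
        echo "-S ${s}GiB: two copies do not fit"
    fi
done
```

A 450 GiB partition corresponds to 943718400 sectors of 512 bytes, which matches the size of nvme0n1p1 and nvme0n1p2 in the listing.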

sudo grep -r "LNXFILE" . reveals that no script sets LNXFILE_b which is used in ./tools/kernel_flash/flash_l4t_nvme_rootfs_ab.xml:

./flash.sh:LNX_TAG+="-e s/LNXFILE/${localbootfile}/ ";
./odmfuse.func: CFGCONV+="-e s/LNXFILE/${localbootfile}/ ";

This points to a possible solution: just replace LNXFILE_b in ./tools/kernel_flash/flash_l4t_nvme_rootfs_ab.xml with LNXFILE. With that change the flash indeed runs through, as it did above with the ln -s workaround.
Is it correct that the same boot.img should be used on A and B?
Why does sudo nvbootctrl -t rootfs dump-slots-info report that A/B rootfs is not enabled?
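
One possible reading of the grep output above: the sed expression s/LNXFILE/${localbootfile}/ matches the LNXFILE prefix of LNXFILE_b as well, so no separate LNXFILE_b rule would be needed, and the resulting file name would be boot.img_b. The sketch below only demonstrates that sed behaviour. The XML line is a hypothetical stand-in for the real layout entry, and whether flash.sh processes the external layout exactly this way is an assumption.

```shell
# Apply the same kind of substitution that flash.sh builds into LNX_TAG.
# $1: one line of partition XML, $2: the boot image file name.
expand_lnxfile() {
    printf '%s\n' "$1" | sed -e "s/LNXFILE/$2/"
}

# The "_b" suffix survives the substitution of the LNXFILE prefix.
expand_lnxfile '<filename> LNXFILE_b </filename>' boot.img
```

If this reading is right, it would explain why the tool looks for a file named boot.img_b, and the open question becomes which step is supposed to create that file.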

On the other hand when running:

aaeon@aaeon-xavier:/home$ sudo nvbootctrl dump-slots-info
Current version: 35.5.0
Capsule update status: 0
Current bootloader slot: A
Active bootloader slot: A
num_slots: 2
slot: 0,             status: normal
slot: 1,             status: normal

So the slots are available in the bootloader but not in the rootfs. Is this correct?

It does not seem to work correctly: when I delete /lib, after 3 retries it starts booting differently, but it ends up not booting at all.

I think the above command is wrong, so I am using the following instead. Here it seems that LNXFILE_b does indeed exist, so I left it intact in the XML:

sudo ROOTFS_AB=1 ./tools/kernel_flash/l4t_initrd_flash.sh \
      --external-device nvme0n1 \
      -S 450GiB \
      -c ./tools/kernel_flash/flash_l4t_nvme_rootfs_ab.xml \
      --showlogs --erase-all \
      jetson-agx-xavier-devkit \
      external

This does flash the NVMe and mmcblk devices as expected; both have an APP and an APP_b partition of the same size. Also this:

aaeon@aaeon-xavier:~$ sudo nvbootctrl dump-slots-info
Error: open fail errno = 2 reason = No such file or directory 
Current version: 0.0.2
Error: open fail errno = 2 reason = No such file or directory 
Error: open fail errno = 2 reason = No such file or directory 
Error: open fail errno = 2 reason = No such file or directory 
Capsule update status: 0
Current bootloader slot: A
Active bootloader slot: A
num_slots: 2
slot: 0,             status: normal
slot: 1,             status: normal
aaeon@aaeon-xavier:~$ sudo nvbootctrl -t rootfs dump-slots-info
Current rootfs slot: A
Active rootfs slot: A
num_slots: 2
slot: 0,             retry_count: 3,             status: normal
slot: 1,             retry_count: 3,             status: normal

seems more correct.
What does Error: open fail errno = 2 reason = No such file or directory mean?

However, when I delete /lib, it retries 3 times and then reports that it is switching the boot chain, but it is unable to boot. Why?

It seems that I can switch to another active slot, but it does not become the current slot after reboot. Why?
aaeon@aaeon-xavier:~$ sudo nvbootctrl set-active-boot-slot 1
After reboot:

aaeon@aaeon-xavier:~$ sudo nvbootctrl get-current-slot
0
aaeon@aaeon-xavier:~$ sudo nvbootctrl -t rootfs dump-slots-info
Current rootfs slot: A
Active rootfs slot: B
num_slots: 2
slot: 0,             retry_count: 3,             status: normal
slot: 1,             retry_count: 3,             status: normal
aaeon@aaeon-xavier:~$ sudo nvbootctrl dump-slots-info
Error: open fail errno = 2 reason = No such file or directory 
Current version: 0.0.2
Error: open fail errno = 2 reason = No such file or directory 
Error: open fail errno = 2 reason = No such file or directory 
Error: open fail errno = 2 reason = No such file or directory 
Capsule update status: 0
Current bootloader slot: A
Active bootloader slot: B
num_slots: 2
slot: 0,             status: normal
slot: 1,             status: normal
aaeon@aaeon-xavier:~$ sudo nvbootctrl verify
Info: variable BootChainFwStatus is not found.
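
The mismatch between the current and the active slot shown above can be detected mechanically. This is a minimal sketch that parses the text format of the dump-slots-info output exactly as printed in this thread; the format may differ in other releases, and check_rootfs_slots is a hypothetical helper name.

```shell
# Read `nvbootctrl -t rootfs dump-slots-info` output on stdin and report
# whether the current and the active rootfs slot agree.
# Exit status: 0 = in sync, 1 = mismatch, 2 = slot lines not found.
check_rootfs_slots() {
    awk -F': *' '
        { gsub(/[ \t\r]+$/, "") }            # drop trailing whitespace
        /^Current rootfs slot/ { cur = $2 }
        /^Active rootfs slot/  { act = $2 }
        END {
            if (cur == "" || act == "") { print "slot info not found"; exit 2 }
            if (cur == act) { print "current=" cur " active=" act " (in sync)"; exit 0 }
            print "current=" cur " active=" act " (mismatch)"
            exit 1
        }'
}

# Intended use on the Jetson (not run here):
#   sudo nvbootctrl -t rootfs dump-slots-info | check_rootfs_slots
```

A mismatch right after set-active-boot-slot is expected until the next reboot. A mismatch that persists after the reboot, as in the output above, suggests the switch did not take effect.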

It seems you have made several trials and attempts to fix the errors you hit.
Please note that the A/B boot chain is enabled by default, but the redundant rootfs (A/B) is disabled by default.
Maybe you still need to add ROOTFS_AB=1 to enable rootfs a/b on NVMe.

Please share the full serial console log for the boot issue.

A version showing 0.0.2 is not the result I would expect.

Please verify that you can switch the slot manually after you flash the board; you should be able to switch the boot chain through the nvbootctrl command.

It seems you have made several trials and attempts to fix the errors you hit.

This is the command I finally used. It deviates in two ways from the command you proposed: -S 450GiB (not 900GiB), and external instead of mmcblk0p1:

sudo ROOTFS_AB=1 ./tools/kernel_flash/l4t_initrd_flash.sh \
      --external-device nvme0n1 \
      -S 450GiB \
      -c ./tools/kernel_flash/flash_l4t_nvme_rootfs_ab.xml \
      --showlogs --erase-all \
      jetson-agx-xavier-devkit \
      external

Maybe you still need to add ROOTFS_AB=1 to enable rootfs a/b on NVMe.

As you can see, I used the flag above. But rootfs A/B is not working correctly due to the errors mentioned.

Please share the full serial console log for the boot issue.

Never did this before. Do I need to connect a serial cable to the Jetson?

A version showing 0.0.2 is not the result I would expect.

What should it be? What about the “no such file or directory” error?

Please verify that you can switch the slot manually after you flash the board; you should be able to switch the boot chain through the nvbootctrl command.

Please read carefully. I was able to switch the active slot, however not the current. Why?

Please refer to the instruction in NVIDIA Jetson Xavier - Serial Console (ridgerun.com) to capture serial console log from AGX Xavier devkit.

It should be exactly the same version as slot A, such as r35.5.0.

It seems that an EFI variable is missing in your case.
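
Whether a given EFI variable exists can be checked from the running system through efivarfs, where each entry is named <VariableName>-<GUID>. This is only a sketch: find_efivar is a hypothetical helper, and BootChainFwStatus is used because nvbootctrl verify reported it as not found earlier in this thread.

```shell
# List efivarfs entries whose variable name matches the given name.
# $1: variable name, $2: efivarfs directory (defaults to the usual mount).
# Returns 0 if at least one entry was found, 1 otherwise.
find_efivar() {
    name=$1
    dir=${2:-/sys/firmware/efi/efivars}
    found=1
    for f in "$dir"/"$name"-*; do
        [ -e "$f" ] || continue   # unmatched glob stays literal, skip it
        basename "$f"
        found=0
    done
    return "$found"
}

find_efivar BootChainFwStatus || echo "BootChainFwStatus not present in efivarfs"
```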

Being able to run sudo nvbootctrl set-active-boot-slot 1 does not mean that the slot switch will succeed.
Please check the serial console log after you run this command and sudo reboot to switch slot.
