Enable A/B Redundancy in Jetson TX2

We are trying to enable A/B redundancy on the TX2. We followed the links below and were able to enable it.

https://docs.nvidia.com/jetson/l4t/index.html#page/Tegra%20Linux%20Driver%20Package%20Development%20Guide/bootloader_update_agx_tx2.html

https://developer.ridgerun.com/wiki/index.php?title=How_to_enable_A/B_redundancy_in_TX2

We updated the smd_info.cfg file, ran “sudo nv_smd_generator smd_info.cfg slot_metadata.bin”, and then flashed our device. Here is a copy of our smd_info.cfg.

 # Copyright (c) 2017-2018, NVIDIA CORPORATION. All rights reserved.
 #
 # Permission is hereby granted, free of charge, to any person obtaining a
 # copy of this software and associated documentation files (the "Software"),
 # to deal in the Software without restriction, including without limitation
 # the rights to use, copy, modify, merge, publish, distribute, sublicense,
 # and/or sell copies of the Software, and to permit persons to whom the
 # Software is furnished to do so, subject to the following conditions:
 #
 # The above copyright notice and this permission notice shall be included in
 # all copies or substantial portions of the Software.
 #
 # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
 # THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 # DEALINGS IN THE SOFTWARE.

# SMD metadata information
< VERSION 3 >

#
# Config 1: Disable A/B support (Default)
#

# slot info order is important!
# <priority>    <suffix>     <retry_count>  <boot_successful>
#15                  _a          7               1

#
# Config 2: Enable redundancy support (by removing comments ##)
#
< REDUNDANCY_USER 1 >

# slot info order is important!
# <priority>    <suffix>     <retry_count>  <boot_successful>
15                  _a          7               1
14                  _b          7               1

We verified this with the commands below.

    nvidia@tegra-ubuntu:~$ sudo nvbootctrl dump-slots-info
    magic:0x43424e00,             version: 3             features: 3             num_slots: 2
    slot: 0,             priority: 15,             suffix: _a,             retry_count: 7,             boot_successful: 1
    slot: 1,             priority: 14,             suffix: _b,             retry_count: 7,             boot_successful: 1
    nvidia@tegra-ubuntu:~$ sudo nvbootctrl get-current-slot
    0

The debug log for this case is attached as “normal_working.log”.
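For repeated checks, the slot state can be pulled out of that output with a small helper. This is only a sketch: the function names `summarize_slots` and `exhausted_slots` are made up for illustration, and the "key: value" field layout is assumed from the `dump-slots-info` output pasted above, so it may need adjusting on other L4T releases.

```shell
#!/bin/sh
# Read `nvbootctrl dump-slots-info` output on stdin and print one line per
# slot: "<slot> <suffix> <retry_count> <boot_successful>".
# Assumption: slot lines look like the ones pasted in this thread.
summarize_slots() {
    awk '
        /^[ \t]*slot:/ {
            gsub(/,/, "")                  # drop the comma separators
            for (i = 1; i < NF; i += 2) {  # fields come in "key: value" pairs
                key = $i
                sub(/:$/, "", key)
                val[key] = $(i + 1)
            }
            print val["slot"], val["suffix"], val["retry_count"], val["boot_successful"]
        }
    '
}

# Print the suffix of every slot whose retry_count has dropped to 0.
exhausted_slots() {
    summarize_slots | awk '$3 == 0 { print $2 }'
}
```

On the device this would be used as `sudo nvbootctrl dump-slots-info | summarize_slots`; piping the same output into `exhausted_slots` lists the slots that have used up their retries.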

In order to test the change in the active slot, we used the dd command to corrupt the _a partition and rebooted the system. We observed two behaviours.

1. We corrupted the mts-bootpack partition using “sudo dd if=examples.desktop of=/dev/mmcblk0p2” and rebooted the system. The system booted successfully, and the nvbootctrl output is below.

    nvidia@tegra-ubuntu:~$ sudo nvbootctrl dump-slots-info
    [sudo] password for nvidia:
    magic:0x43424e00,             version: 3             features: 3             num_slots: 2
    slot: 0,             priority: 15,             suffix: _a,             retry_count: 0,             boot_successful: 1
    slot: 1,             priority: 14,             suffix: _b,             retry_count: 7,             boot_successful: 1
    nvidia@tegra-ubuntu:~$ sudo nvbootctrl get-current-slot
    1

The device booted from slot 1. The debug log is attached as “mts-bootpack_corrupted.log”.

2. We tried the same by corrupting the kernel partition using “sudo dd if=examples.desktop of=/dev/mmcblk0p26”. This time the device failed to boot. The debug log is attached as “kernel_corrupted.log”.

Are these steps enough to enable this device recovery feature? Is there an alternative way to test it?

Attachments:

kernel_corrupted.log (11.7 KB) mts-bootpack_corrupted.log (18.9 KB) normal_working.log (18.7 KB)

hello ashlinsurey.a,

may I know which JetPack release you’re working with?
have you enabled the secure boot feature for testing A/B redundancy?
thanks

hello Jerry,

We are testing this in Jetpack 4.2.2, L4T 32.2.1.

No, we have not enabled the secure boot feature. Could you please share the details/steps to enable this?

Thanks.

hello ashlinsurey.a,

there are some fixes checked in to the latest release for A/B redundancy.
would you please move to the latest JetPack release, i.e. JetPack-4.4 / l4t-r32.4.3, to verify?
thanks

Hi Jerry,

Right now, it’s not possible to move our development to the latest version. Is there a way to apply those fixes to L4T 32.2.1?

Thanks.

hello ashlinsurey.a,

I’ve checked and found that the kernel partitions are /dev/mmcblk0p28 and /dev/mmcblk0p29 on my TX2.
could you please check again that you corrupted the intended partition?

$ ls -al /dev/disk/by-partlabel
...
lrwxrwxrwx 1 root root  16  Sep  22 14:58 kernel -> ../../mmcblk0p28
lrwxrwxrwx 1 root root  16  Sep  22 14:58 kernel_b -> ../../mmcblk0p29

Hi Jerry,
I tried executing the same command in my setup. Here the kernel partition is /dev/mmcblk0p26.

nvidia@tegra-ubuntu:~$ ls -al /dev/disk/by-partlabel
total 0

lrwxrwxrwx 1 root root  16 Jan 28  2018 kernel -> ../../mmcblk0p26
lrwxrwxrwx 1 root root  16 Jan 28  2018 kernel_b -> ../../mmcblk0p27

I have attached the entire output of this command in a file.

part_lablel.log (2.2 KB)
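Since the kernel partition number differs between the two setups (p28/p29 in one, p26/p27 in the other), it may be safer to resolve the device node from the partition label instead of hard-coding a number. A minimal sketch follows; `part_by_label` is a hypothetical helper name, and the optional directory argument exists only so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# Resolve a partition label (e.g. "kernel" or "kernel_b") to its device node
# by following the symlink under /dev/disk/by-partlabel.
# $1: partition label
# $2: directory holding the label symlinks (optional, for testing)
part_by_label() {
    label=$1
    dir=${2:-/dev/disk/by-partlabel}
    if [ ! -e "$dir/$label" ]; then
        echo "no partition labelled '$label' in $dir" >&2
        return 1
    fi
    # Print the canonical path the symlink points to.
    readlink -f "$dir/$label"
}
```

With the listing above, `part_by_label kernel` would be expected to print /dev/mmcblk0p26 on this setup, and the result could then be passed to dd in place of a fixed partition number.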

hello ashlinsurey.a,

how about using r32.3.1? there are also bootloader changes in that release to correct A/B redundancy failures.

Hi ashlinsurey.a,

Is this still an issue that needs support? Are there any results you can share?

Hi Kayccc

We tried the same with the new L4T 32.3.1 and L4T 32.4.3 and our evaluation works fine. We are trying to move our BSP to the new L4T versions. Thanks for your support.