Enabling Partition A/B redundancy on Jetson Xavier AGX Devkit on R35.3.1

Hello,
I am trying to enable Partition A/B redundancy on a Jetson Xavier AGX devkit running Jetson Linux R35.3.1.
I followed the documentation and other threads here, and this is where I currently am:

  1. $ sudo ROOTFS_AB=1 ROOTFS_RETRY_COUNT_MAX=3 ./flash.sh [options] <target_board>
    I executed the above command, and it created APP and APP_b for the rootfs as expected. Size of the APP and APP_b along with other partitions.

  2. However, if I delete /lib with rm -rf from partition 0, the one I am booting from, I end up with an error on reboot.
    I expected the Jetson to boot from APP_b in that case, but it does not. How do I enable this?
    I have read threads about a defect from Jetson R35.2 onwards, but I need more help here.

Thanks!

Hi smuthnephewdeveloper,

Could you share the full serial console log when you are doing this test?
What error do you hit? Is it a kernel panic?

Hi, did you manage to find a solution for this?

Hello Kevin,
Sorry for the delayed response. Anywho, please find the logs attached.
PuttyLogs8Sep2023.txt (94.1 KB)

Yes, I hit the kernel panic, and it takes a couple of minutes before it tries to boot from the other partitions. But it ends at the following (this is also the end of the attached log file):
Jetson UEFI firmware (version 3.1-32827747 built on 2023-03-19T14:56:32+00:00)
ESC to enter Setup.
F11 to enter Boot Manager Menu.
Enter to continue boot.
Rebooting to new boot c

After this, there is no activity on the terminal for the next 15 minutes, and the Jetson Xavier devkit is in Force Recovery mode.

Any thoughts?

FYI: my Xavier devkit has the PublicKeyHash, SBK, KEK0 and KEK1 fuses burnt, along with the production bit set to 1. I am not sure whether this has anything to do with the above observations.

Not yet. Just responded to KevinFFF above with more details.

The log here is in UEFI.

I tried to reproduce the issue on the devkit with the same steps as yours, but did not hit it.
It switched to and booted from the other slot after booting failed 3 times due to kernel panic.

From your log, there seems to be only one kernel panic.
Could you reproduce it again to check whether you hit the same issue?
Or please verify with the latest R35.4.1.
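
To check how many kernel panics a captured serial console log contains, something like the following could be used. It is only a sketch: the function name count_kernel_panics is hypothetical, and it assumes the panics appear with the standard Linux "Kernel panic - not syncing" message. The matching "---[ end Kernel panic" trailer line is filtered out so that each panic is counted once.

```shell
#!/bin/sh
# Hypothetical helper: count kernel panics in a saved serial console log.
# Assumption: each panic is reported with the standard Linux message
# "Kernel panic - not syncing: ...". The trailer line
# "---[ end Kernel panic - not syncing: ... ]---" is excluded so that
# one panic is not counted twice.
count_kernel_panics() {
    log_file="$1"
    # grep exits non-zero when nothing matches; "|| true" keeps the
    # function usable under "set -e" while still printing the count.
    grep 'Kernel panic - not syncing' "$log_file" \
        | grep -vc 'end Kernel panic' || true
}
```

For example, `count_kernel_panics PuttyLogs8Sep2023.txt` prints the number of panics recorded in that log, which can then be compared against the configured retry count.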

You could also switch to slot B and back to slot A before doing the experiment, so that we know boot chain switching works before slot A is corrupted.
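
The slot round trip could look roughly like this with the nvbootctrl tool. This is a sketch under assumptions: the helper name switch_rootfs_slot is hypothetical, it is meant to be run on the Jetson itself, and it assumes `-t rootfs` addresses the rootfs slots with slot 0 being A and slot 1 being B.

```shell
#!/bin/sh
# Hypothetical helper: select the rootfs slot to use on the next boot.
# Assumption: slot 0 is A and slot 1 is B.
switch_rootfs_slot() {
    slot="$1"
    case "$slot" in
        0|1) ;;
        *) echo "usage: switch_rootfs_slot 0|1" >&2; return 2 ;;
    esac
    if ! command -v nvbootctrl >/dev/null 2>&1; then
        echo "nvbootctrl not found; run this on the Jetson" >&2
        return 127
    fi
    # Show the slot state before changing it, then request the new slot.
    sudo nvbootctrl -t rootfs dump-slots-info
    sudo nvbootctrl -t rootfs set-active-boot-slot "$slot"
}
```

For the round trip, one would run `switch_rootfs_slot 1`, reboot, confirm the slot with `sudo nvbootctrl -t rootfs get-current-slot`, then run `switch_rootfs_slot 0` and reboot again before removing /lib from slot A.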

Hello Kevin,
I will add more details.

  1. I am observing only 1 kernel panic because I have set ROOTFS_RETRY_COUNT_MAX=1, so that makes sense. Could that be an issue?
  2. Does your kit have the Public Key Hash, SBK and other fuses set, and the production fuse enabled? I am asking because I have another kit on which I haven’t burnt any fuses, and that one works correctly. However, this one, with the fuses written, does not. Any thoughts?

That should be all right. I asked only because you said 3 in the original post.

No, I just followed exactly the same procedure as yours, without any fuse or key enabled. Maybe we should focus on that, in case the issue is caused by the fused device.
Please share the detailed steps for how you set up the board, so that I can reproduce it.

Please find the fuse.xml that I used attached below.
Fuse_Config_Xavier.xml (2.8 KB)

Hi,
According to the “Root File System — Jetson Linux Developer Guide” documentation, I think the Xavier should automatically switch to another bootable chain if it encounters a boot failure.
But we encountered the same issue on R35.4.1 with the following steps.

  1. I flashed a Xavier NX with A/B partition images using the following command:
    sudo ROOTFS_AB=1 ROOTFS_RETRY_COUNT_MAX=1 ./flash.sh jetson-xavier-nx-devkit internal
    After that, I can switch the boot chain by using UEFI or nvbootctrl.
  2. After booting it up, I removed /lib from the current rootfs, then turned the power off and on to reboot it.
  3. The Xavier NX booted from chain 0 again, with an “end Kernel panic …” message.
  4. After waiting 10 minutes, I manually turned the power off and on to reboot it again.
    The Xavier NX booted from chain 0 again and showed the same “end Kernel panic” message.

I repeated step 4 three more times and got the same result.

Why does it always boot from chain 0 instead of switching to another bootable chain when it encounters a boot failure?

Why does the Xavier NX not reboot automatically after it encounters a kernel panic? Do we need to do anything more for that?
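
One general Linux setting that bears on the second question is the kernel.panic sysctl, readable at /proc/sys/kernel/panic: zero means the kernel loops forever after a panic, a positive value means it reboots after that many seconds, and a negative value means it reboots immediately. The sketch below only interprets that value. The function name describe_panic_timeout is hypothetical, and nothing in this thread confirms that this setting is the cause here; on Jetson a watchdog reset may also be involved.

```shell
#!/bin/sh
# Hypothetical helper: explain a kernel.panic value, as read from
# /proc/sys/kernel/panic on a running system.
describe_panic_timeout() {
    value="$1"
    if [ "$value" -eq 0 ]; then
        echo "no automatic reboot on panic"
    elif [ "$value" -gt 0 ]; then
        echo "reboot ${value}s after panic"
    else
        echo "reboot immediately on panic"
    fi
}
```

On a slot that still boots, it could be called as `describe_panic_timeout "$(cat /proc/sys/kernel/panic)"`.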

boot_log_AB.txt (199.0 KB)

Jimmy

@KevinFFF, I provided the Fuse.xml file on Sep 13 (I think the message might have been missed). Let me know how that worked for you.

From your serial console log, it seems your watchdog is not configured correctly. Please check the device tree for the watchdog.
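
One way to inspect the watchdog nodes of the running device tree is sketched below. It is an illustration under assumptions: the function name list_watchdog_nodes is hypothetical, it assumes watchdog nodes are named "watchdog" or "watchdog@<address>", and it treats a node without a status property as enabled, following the usual devicetree convention.

```shell
#!/bin/sh
# Hypothetical helper: list watchdog nodes under a device-tree root and
# print each node's "status" property.
# Assumptions: nodes are named "watchdog" or "watchdog@<addr>"; a node
# with no "status" property counts as enabled.
list_watchdog_nodes() {
    dt_root="${1:-/proc/device-tree}"
    # -H follows the symlink if the root itself is one, as
    # /proc/device-tree usually is.
    find -H "$dt_root" -type d -name 'watchdog*' | while read -r node; do
        if [ -f "$node/status" ]; then
            # Property files are NUL-terminated strings; strip the NUL.
            status=$(tr -d '\0' < "$node/status")
        else
            status="okay (no status property)"
        fi
        echo "$node: $status"
    done
}
```

Running `list_watchdog_nodes` with no argument on the board walks /proc/device-tree. A node reported as "disabled" would be worth a closer look.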

Please provide the detailed steps for how you configured the board with fuses.

I’m closing this topic because there has been no update from you for a while, assuming the issue was resolved.
If you still need support, please open a new topic. Thanks

Is this still an issue that needs support? Are there any results you can share? Thanks