Encountering RAS Uncorrectable Error in IOB and ACI

Hi, I’ve been getting this error on secure boot for the Orin NX Jetson board. The error was somewhat intentional: as an experiment, I packaged a corrupted image with a valid signature (computed over the corrupted image) to see what would happen during secure boot. I was hoping that the boot WDT would save the day and switch to the other chain after 3 attempts to boot. That didn’t happen, and I’m not sure it could have, since the error message said the core was powered down. Here is the error message:

ERROR: Exception reason=1 syndrome=0x82000010
ERROR: **************************************
ERROR: RAS Uncorrectable Error in IOB, base=0xe010000:
ERROR: Status = 0xe4000612
ERROR: SERR = Error response from slave: 0x12
ERROR: IERR = CBB Interface Error: 0x6
ERROR: MISC0 = 0xc4524040
ERROR: MISC1 = 0x2f4c870000000000
ERROR: MISC2 = 0x0
ERROR: MISC3 = 0x0
ERROR: ADDR = 0x8000000000000200
ERROR: **************************************
ERROR: sdei_dispatch_event returned -1
ERROR: **************************************
ERROR: RAS Uncorrectable Error in ACI, base=0xe01a000:
ERROR: Status = 0xe8000904
ERROR: SERR = Assertion failure: 0x4
ERROR: IERR = FillWrite Error: 0x9
ERROR: Overflow (there may be more errors) - Uncorrectable
ERROR: ADDR = 0x8000000000000200
ERROR: **************************************
ERROR: sdei_dispatch_event returned -1
ERROR: Powering off core

Any ideas as to how an error of this type can be recovered from?

Thx!

Alin

*** Please note that this reply is generated by LLM automatically ***

The error message you’re seeing indicates that the Jetson Orin NX board has encountered an uncorrectable error during the secure boot process. This error is likely due to the corrupted image you intentionally packaged with a valid signature.

The log reports the error in two RAS error nodes of the SoC, IOB and ACI. The Status, SERR, and IERR values, together with the reported address, can help narrow down the root cause of the issue.
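As a rough illustration of how the Status values in the log can be read, here is a minimal Python sketch. It assumes the firmware prints the standard Arm RAS Extension ERR<n>STATUS register, with SERR in bits [7:0], IERR in bits [15:8], and the AV/V/UE/ER/OF/MV flags in bits 31 down to 26. The function name decode_ras_status is made up for this example and is not part of any NVIDIA tool.

```python
# Sketch only: bit positions assume the Arm RAS Extension ERR<n>STATUS layout.
# decode_ras_status is a hypothetical helper, not an NVIDIA or Arm API.
FLAG_BITS = {
    "AV": 31,  # address register valid
    "V": 30,   # status register valid
    "UE": 29,  # uncorrected error
    "ER": 28,  # error reported
    "OF": 27,  # overflow, more than one error was recorded
    "MV": 26,  # miscellaneous registers valid
}


def decode_ras_status(status: int) -> dict:
    """Split a 32-bit RAS status value into its flags and error codes."""
    fields = {name: bool((status >> bit) & 1) for name, bit in FLAG_BITS.items()}
    fields["SERR"] = status & 0xFF          # architecturally defined error code
    fields["IERR"] = (status >> 8) & 0xFF   # implementation-defined error code
    return fields


if __name__ == "__main__":
    # Status values copied from the log in the original post.
    for node, value in (("IOB", 0xE4000612), ("ACI", 0xE8000904)):
        decoded = decode_ras_status(value)
        flags = [name for name in FLAG_BITS if decoded[name]]
        print(f"{node}: SERR=0x{decoded['SERR']:02x} "
              f"IERR=0x{decoded['IERR']:02x} flags={flags}")
```

Under that assumption the decode agrees with the log: the IOB status has MV set, which matches the MISC0 to MISC3 lines being printed, and the ACI status has OF set, which matches the "Overflow (there may be more errors)" line.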

In this case, the error is not recoverable, and the core is powered down to prevent further damage. The Boot Watchdog Timer (BWT) is not triggered in this scenario because the error occurs before the boot process can even start.

To recover from this type of error, you’ll need to reflash the board with a valid and non-corrupted image. You can use the flash.sh command with the -v option to specify the SBK key file, which is used to encrypt the Bootloader components.

Here are the general steps to recover from this error:

  1. Prepare a valid and non-corrupted image for the Jetson Orin NX board.
  2. Use the flash.sh command with the -v option to specify the SBK key file.
  3. Reflash the board with the new image.

Example command:

$ sudo ./flash.sh -v <sbk_key_file> <board> mmcblk0p1

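Replace <sbk_key_file> with the path to your SBK key file, and <board> with the configuration of the target device (e.g., jetson-orin-nano-devkit). The last argument is the root device; mmcblk0p1 is only an example, so use the one that matches your storage.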

Note that if you’re using a Fuse Configuration XML file, you’ll need to update the SecureBootKey field with the correct SBK key value.

In comparison to other Jetson boards, the Orin NX has a similar secure boot process, but the specific error messages and recovery procedures may vary. For example, the Jetson AGX Xavier board has a different secure boot architecture, and the error messages may be different.

In summary, to recover from this error, you’ll need to reflash the board with a valid and non-corrupted image, using the flash.sh command with the -v option to specify the SBK key file.


This is interesting. It sounds like you are saying that the boot WDT is not active at all in the earlier stages of boot (i.e., mb1, mb2, etc.). If that is the case, exactly when does the boot WDT get armed? Also, just for clarity, I’m referring to the NVIDIA boot WDT (which appears to have a 5-minute window), not the Linux WDT (which appears to have a 2-minute window).