Jetson Orin NX fails to boot after UEFI update

Hello!
I encountered an issue where the device occasionally did not start properly after restarting. The serial port log is as follows:


I see the same problem reported on the forum:

I then updated UEFI as follows:

  • Download and decompress edk2-nvidia-202406.0.tar.gz
  • Copy uefi_Jetson_RELEASE.bin to Linux_for_Tegra/bootloader/uefi_jetson.bin
  • Flash QSPI: ./flash.sh -c bootloader/generic/cfg/flash_t234_qspi.xml $BOARD nvme0n1p1

After that, the device fails to start. The serial port log is as follows:
ERROR:   RAS Uncorrectable Error in CCPMU, base=0xe001000:
ERROR:   	Status = 0xe4000504
ERROR:   SERR = Assertion failure: 0x4
ERROR:   	IERR = uCode Error: 0x5
ERROR:   	MISC0 = 0x0
ERROR:   	MISC1 = 0x0
ERROR:   	MISC2 = 0x0
ERROR:   	MISC3 = 0x0
ERROR:   	ADDR = 0x60a5a5a5a5a5a5a5
ERROR:   **************************************
ERROR:   sdei_dispatch_event returned -1
ERROR:   Powering off core
ERROR:   ARI request timed out: req 34
ASSERT: plat/nvidia/tegra/soc/t234/drivers/mce/ari.c:154
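The replacement and flash steps listed above can be sketched as a small dry-run script, which makes it easier to review the exact commands before touching QSPI. This is only a sketch under assumptions: the variable names L4T_DIR, UEFI_BIN and DRY_RUN and both function names are hypothetical, the UEFI binary is assumed to be already extracted from the tarball, and the backup copy is an extra precaution that is not part of the original steps. The flash.sh arguments are the ones quoted in the post.

```shell
#!/bin/sh
# Dry-run sketch of the UEFI replacement and QSPI flash steps from the post.
# Hypothetical names (not from the thread): L4T_DIR, UEFI_BIN, DRY_RUN, both functions.
set -eu

L4T_DIR="${L4T_DIR:-Linux_for_Tegra}"                # extracted Jetson Linux BSP
UEFI_BIN="${UEFI_BIN:-$PWD/uefi_Jetson_RELEASE.bin}" # absolute path to the new UEFI binary
DRY_RUN="${DRY_RUN:-1}"                              # 1 = only print, 0 = execute

run_in_l4t() {
    # Print the command, then run it from inside $L4T_DIR unless in dry-run mode.
    echo "+ (cd $L4T_DIR && $*)"
    if [ "$DRY_RUN" = "0" ]; then
        (cd "$L4T_DIR" && "$@")
    fi
}

update_uefi_and_flash_qspi() {
    board="$1"  # flash configuration name for your module and carrier

    # Extra precaution (not in the original steps): keep the stock binary.
    run_in_l4t cp bootloader/uefi_jetson.bin bootloader/uefi_jetson.bin.orig

    # Replace the UEFI binary shipped with the BSP.
    run_in_l4t cp "$UEFI_BIN" bootloader/uefi_jetson.bin

    # Flash QSPI with the command quoted in the post.
    run_in_l4t ./flash.sh -c bootloader/generic/cfg/flash_t234_qspi.xml "$board" nvme0n1p1
}
```

Calling `update_uefi_and_flash_qspi "$BOARD"` with the default DRY_RUN=1 only prints the three commands; setting DRY_RUN=0 would actually run them.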

The same problem is reported on the forum: Jetson Orin Nano Developer board: failed to bootup after UEFI updated - #6 by stevenc
However, I can’t update the SDK; I only need to update UEFI. What should I do?

  • The hardware is a Jetson Orin NX module on a custom carrier board, and the SDK is Jetson Linux 36.3

@WayneWWW
Hello, I saw your reply. Can you tell me which commit fixed this problem? I just need to apply that patch to the R36.3.0 source code.

Hi,

Could you try 36.4 directly instead of patching?

I have made a lot of modifications on top of 36.2, so moving to 36.4 would be troublesome.
Is this it? PR to fix ASSERTS during boot and resetting the menu. by ashishsingha · Pull Request #103 · NVIDIA/edk2-nvidia · GitHub

@WayneWWW
Hi, I downloaded the R36.4.0 SDK, copied Linux_for_Tegra/bootloader/uefi_jetson.bin into my current directory, and then re-flashed QSPI, but the problem still occurs.
This is the version information printed on the serial port:
Jetson UEFI firmware (version 36.4.0-gcid-37537400 built on 2024-09-13T04:02:39+00:00)
This is the serial port log of the error:

I/TC: Reserved shared memory is disabled
I/TC: Dynamic shared memory is enabled
I/TC: Normal World virtualization support is disabled
I/TC: Asynchronous notifications are disabled
I/TC: WARNING: Test UEFI variable auth key is being used !
I/TC: WARNING: UEFI variable protection is not fully enabled !

ASSERT [FvbNorFlashStandaloneMm] /out/nvidia/optee.t234-uefi/StandaloneMmOptee_RELEASE/edk2-nvidia/Silicon/NVIDIA/Drivers/FvbNorFlashDxe/FvbNorFlashStandaloneMm.c(937): ((BOOLEAN)(0==1))

Can you look at it again for me?
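When comparing boots like this, it can help to pull the UEFI version string and any ASSERT lines out of a saved serial log, to confirm which firmware actually ran. The following is a minimal sketch; the function names are hypothetical, and the log file is whatever your terminal program saved. The patterns are based only on the log lines quoted in this thread.

```shell
#!/bin/sh
# Sketch: summarize a captured serial console log.
# Hypothetical helper names; patterns match the log lines quoted in this thread.
set -eu

uefi_version() {
    # Print the version token from lines like:
    #   Jetson UEFI firmware (version <token> built on <timestamp>)
    sed -n 's/.*Jetson UEFI firmware (version \([^ ]*\) built on.*/\1/p' "$1"
}

assert_lines() {
    # Print every line containing ASSERT, prefixed with its line number.
    # "|| true" keeps the script going when the log has no ASSERT at all.
    grep -n 'ASSERT' "$1" || true
}
```

For example, `uefi_version serial.log` followed by `assert_lines serial.log` shows at a glance whether the ASSERT came from the boot that reported the expected firmware version.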

Hi @qiaowei

Just to clarify, did you do a full flash or only flash the bootloader?

Could you try upgrading the whole BSP rather than just UEFI?

I copied uefi_jetson.bin into R36.3 and then did a full flash.

Can I update just the BSP and leave rootfs, ota_tools, and public_sources unchanged, since I have made a lot of changes to them?
Or can I modify this to solve the problem?

Let me put it more directly. Could you flash your whole board with the JetPack 6.1 BSP instead of doing what you are doing now?

Replacing only some components like this is not going to work well. For example, we don’t support using a rel-36 UEFI to boot a rel-35 rootfs.

Hello, I saw that other users also had this problem with R36.4.0:

Yes, we noticed that report and are still trying to reproduce it. It seems to be random and not easy to hit.

Hello, I see there is a way to work around this issue: Updating TOS in Jetpack 6 - #5 by KevinFFF
However, we ran into some problems while modifying tos-optee.
I followed the documentation: Build without docker · NVIDIA/edk2-nvidia Wiki · GitHub
I set PcdAssertOnVarStoreIntegrityCheckFail to FALSE, then compiled uefi_jetson.bin and standalonemm_optee_t234.bin.
I used the sources from https://nv-tegra.nvidia.com/r/admin/repos/q/filter:optee-src to build both the atf (for bl31.bin) and nv-optee, both on the jetson_36.3 branch.
I then updated A_cpu-bootloader and A_secure-os, but the device crashes after startup:

I> Task: Bootchain failure check
I> Current Boot-Chain Slot: 0
I> BR-BCT Boot-Chain is 0, and status is 1. Set UPDATE_BRBCT bit to 0
I> Task: Burn RESERVED_ODM0 fuse
I> Task: Lock fusing
I> Task: Clear dec source key
I> MB2 finished

NOTICE:  BL31: v2.8(release):V2.0.7-6-gf463dc0-dirty
NOTICE:  BL31: Built : 11:11:36, Nov 14 2024
I/TC: 
I/TC: Non-secure external DT found
I/TC: OP-TEE version: 3.22 (gcc version 11.3.0 (Buildroot 2022.08)) #2 Thu Nov 14 06:07:55 UTC 2024 aarch64
I/TC: WARNING: This OP-TEE configuration might be insecure!
I/TC: WARNING: Please check https://optee.readthedocs.io/en/latest/architecture/porting_guidelines.html
I/TC: Primary CPU initializing
I/TC: Test OEM keys are being used. This is insecure for shipping products!
I/TC: Primary CPU switching to normal world boot
Unhandled Exception from EL1
x0             = 0xbe079bff9347c86a
x1             = 0xd2ef6f3ad7dac5cf
x2             = 0x00000000000f4240
x3             = 0x0000000081000000
x4             = 0x0000000000000001
x5             = 0x00000000be1ead38
x6             = 0xffffffffffffffff
x7             = 0x00000000be261ef0
x8             = 0x0000000000000020

@WayneWWW Anything else that I could try?

Thank you

@KevinFFF Could you please help me with this problem?

It looks like this ASSERT is caused by variable integrity check failures, possibly triggered by write operations. Can I get around the problem by mounting efivars as read-only?
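For reference, checking and changing the efivars mount mode from Linux could look like the sketch below. The function name efivars_mode is hypothetical. Note the limits of the idea: a read-only mount only blocks variable writes coming from Linux userspace, and nothing in this thread confirms that it avoids the ASSERT, since firmware can still write variables during boot.

```shell
#!/bin/sh
# Sketch: report how efivarfs is currently mounted.
# efivars_mode is a hypothetical helper, not an NVIDIA tool.
set -eu

EFIVARS_DIR="/sys/firmware/efi/efivars"

efivars_mode() {
    # $1: mounts table to read (defaults to /proc/mounts).
    # Prints "ro", "rw", or "absent" if efivarfs is not mounted.
    awk -v dir="$EFIVARS_DIR" '
        $2 == dir {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro" || opts[i] == "rw") mode = opts[i]
        }
        END { print (mode == "" ? "absent" : mode) }
    ' "${1:-/proc/mounts}"
}

# To remount read-only on a running system (needs root):
#   mount -o remount,ro /sys/firmware/efi/efivars
```

Running `efivars_mode` before and after the remount shows whether the change took effect.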

This commit will fix this issue.