Jetson Orin NX not booting

Hi,

Our Jetson Orin NX unit has suddenly stopped booting.

Please find the GPU console log attached for your information.

Please let us know the root cause of this boot issue.
I observed that the boot reaches the MB2 stage, and after a few messages there is no further output in the log.

Thanks
Nagesh R

GPUteraterm.log.txt (36.0 KB)
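
A quick way to see where a captured console log stops and whether it contains an assertion is sketched below. This is only a sketch: it assumes the log is a plain-text serial capture and that assertion messages contain the word "ASSERT". The function name scan_boot_log is hypothetical.

```shell
#!/bin/sh
# scan_boot_log LOG_FILE
# Assumption: the log is a plain-text serial capture and UEFI assertion
# messages contain the word "ASSERT" (matched case-insensitively).
scan_boot_log() {
    log_file="$1"
    # Show each matching line with its line number.
    grep -n -i 'assert' "$log_file" || echo "no assertion lines found"
    # Show the last non-blank line, i.e. the point where the log stops.
    echo "last line: $(tr -d '\r' < "$log_file" | grep -v '^[[:space:]]*$' | tail -n 1)"
}
```

For the attachment above this would be run as: scan_boot_log GPUteraterm.log.txt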

Thanks for the update.

One thing is not clear here.
When we tested the unit at our site, it worked fine (we did not apply any UEFI patch), but the customer reports this error at their site.

Currently we don’t have access to the unit.
The customer is not familiar with reflashing or applying this patch.

Any idea why the unit booted and worked fine here but suddenly gives this UEFI assertion error at the customer site?

Thanks for your thoughts in advance.

Some users hit this issue during a reboot stress test, and others hit it with specific steps.
This assertion may be related to a memory leak of a UEFI variable.
For details, you can also refer to "Assertion issue in UEFI during boot" (Jetson & Embedded Systems / Jetson AGX Orin, NVIDIA Developer Forums).

Thanks a lot for the updates.

In that link, I found that the assertion we are facing is assertion no. 3.

It says:

Has not been fixed in any release, but the next release of JP5 and JP6 will include it.

  1. We have flashed version 36.3 on the Jetson Orin NX unit, so does that mean it has been fixed in version 36.4?

  2. I see that there is a separate command to flash only QSPI, but we use a full command which does internal and external NVMe flashing.

Does the overall flashing command need to be changed to include this QSPI flashing?

Please explain, as I am new to downloading and building this UEFI code on a host PC.

Please provide the steps, in sequence, to bring the unit back up now.

Thanks

No, it has not been fixed in either JP6.0 (r36.3.0) or JP6.1 (r36.4.0).

You can perform a QSPI-only flash to apply the change.

Please refer to "Build with docker" in the NVIDIA/edk2-nvidia Wiki to set up and build the UEFI binary.

@KevinFFF

Ok. Thanks. Any idea when r36.5.0, where this issue is fixed, will be released?

ok.

ok.

I have the doubts below. As we don’t have the unit with us to try building and installing this UEFI patch, we are trying to give the customer exact steps to fix the issue on their side, hence these questions.

  1. In one link I see the build.sh step given as:
    edk2_docker edk2-nvidia/Platform/NVIDIA/Jetson/build.sh

In another link it is:
$ edk2_docker edk2-nvidia/Platform/NVIDIA/StandaloneMmOptee/build.sh

Which is the correct path?

  2. In the link below, I see this step:
    Assertion issue in UEFI during boot - #8 by KevinFFF

$ export UEFI_STMM_PATH=/images/uefi_StandaloneMmOptee_RELEASE.bin

But I don’t see this step in the link:

  3. In the link below it says:

cd $EDK2_BUILD_ROOT/edk2-nvidia
git apply <PATH_TO_DIFF_PATCH>

I just want to know what the correct path for “<PATH_TO_DIFF_PATCH>” is.

Thanks. Kindly reply as soon as possible, as we have to fix this issue at the customer site today.

Hi:
Here is the patch by Wayne:

Thanks.

Sorry, I know only some basics of GitHub, so I am asking this question.
If we use the command below, will the patch be applied?

git apply e96433c.diff

Hi:
Yes. Running git apply <PATCH> will apply the patch.
You should be able to see the changes from the commit here.

The steps I shared are for rebuilding the UEFI binary. However, that will not fix the issue. Ref:

So you might still need to follow further steps to build the optee image and reflash it onto the NX.

Thanks for the help.

Do you mean the changes will be committed to the GitHub repository as well?

It’s very confusing, as we don’t have time to read everything.
I was reading about the docker method of building UEFI, applying the patch, and reflashing.

Could you provide the full working steps, including this optee image?
As we don’t have the unit with us and the customer is not technically strong, we are finding this very difficult.

However, I have documented the steps below; please check whether they are correct.

Actual Steps to follow:

Setup the UEFI workspace directory:

cd ~

mkdir jetsonUefi

cd jetsonUefi

mkdir uefiWorkspace

nano jp6UefiSetup.sh

Source jp6UefiSetup.sh every time you start a new session.

The contents of jp6UefiSetup.sh should be as shown below:

#!/bin/bash

# Point to the Ubuntu-22 dev image

export EDK2_DEV_IMAGE="ghcr.io/tianocore/containers/ubuntu-22-dev:latest"

# Required

export EDK2_USER_ARGS="-v \"${HOME}\":\"${HOME}\" -e EDK2_DOCKER_USER_HOME=\"${HOME}\""

# Required, unless you want to build in your home directory.

# Change "/build" to be a suitable build root on your system.

export EDK2_BUILD_ROOT="$(pwd)/uefiWorkspace"

export EDK2_BUILDROOT_ARGS="-v \"${EDK2_BUILD_ROOT}\":\"${EDK2_BUILD_ROOT}\""

# Create the alias

alias edk2_docker="docker run -it --rm -w \"\$(pwd)\" ${EDK2_BUILDROOT_ARGS} ${EDK2_USER_ARGS} \"${EDK2_DEV_IMAGE}\""

Apply the settings:

source jp6UefiSetup.sh
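
After sourcing the setup script, it may help to confirm that the variables it is supposed to export are actually set before starting a long docker build. A minimal sketch, assuming the four variable names used in the script above; the function name check_edk2_env is hypothetical.

```shell
#!/bin/sh
# check_edk2_env
# Returns 0 when every variable the setup script is expected to export is
# non-empty and the build root exists; prints what is missing otherwise.
check_edk2_env() {
    status=0
    for name in EDK2_DEV_IMAGE EDK2_USER_ARGS EDK2_BUILD_ROOT EDK2_BUILDROOT_ARGS; do
        # Look up the value of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name"
            status=1
        fi
    done
    if [ -n "${EDK2_BUILD_ROOT:-}" ] && [ ! -d "$EDK2_BUILD_ROOT" ]; then
        echo "not a directory: $EDK2_BUILD_ROOT"
        status=1
    fi
    return "$status"
}
```

Running check_edk2_env in the same shell session right after the source command should print nothing if the setup took effect.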

Add the repo and clone it:

edk2_docker init_edkrepo_conf

edk2_docker edkrepo manifest-repos add nvidia https://github.com/NVIDIA/edk2-edkrepo-manifest.git main nvidia

# Clone

edk2_docker edkrepo clone $EDK2_BUILD_ROOT NVIDIA-Platforms main

sudo chown -hR $USER ./*

Apply the diff patch:

cd $EDK2_BUILD_ROOT/edk2-nvidia

git apply <PATH_TO_DIFF_PATCH>

Here <PATH_TO_DIFF_PATCH> is a local copy of the diff from Pull Request #110 in NVIDIA/edk2-nvidia ("Varint readfix r35.5.0" by gmahadevan).

Note: if the above “git apply” command does not work, make the code changes manually.

Build Jetson UEFI:

$ export UEFI_STMM_PATH=/images/uefi_StandaloneMmOptee_RELEASE.bin

cd $EDK2_BUILD_ROOT

edk2_docker edk2-nvidia/Platform/NVIDIA/Jetson/build.sh

Replace the UEFI image in JetPack, where $JETPACK points to your Linux_for_Tegra folder:

sudo mv $JETPACK/bootloader/uefi_jetson.bin $JETPACK/bootloader/BACK_UP_uefi_jetson.bin

sudo cp $EDK2_BUILD_ROOT/images/uefi_Jetson_RELEASE.bin $JETPACK/bootloader/uefi_jetson.bin
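
The backup-and-replace step can be made a little safer with a helper that refuses to overwrite anything when the new binary is missing. This is only a sketch: the function name install_uefi and the .bak suffix are hypothetical and are not part of JetPack.

```shell
#!/bin/sh
# install_uefi NEW_BIN TARGET_BIN
# Copies NEW_BIN over TARGET_BIN. The first time it runs, the existing
# TARGET_BIN is preserved as TARGET_BIN.bak; later runs keep that backup.
# Nothing is touched if NEW_BIN is missing or empty.
install_uefi() {
    new_bin="$1"
    target_bin="$2"
    if [ ! -s "$new_bin" ]; then
        echo "error: $new_bin is missing or empty" >&2
        return 1
    fi
    if [ -f "$target_bin" ] && [ ! -e "$target_bin.bak" ]; then
        cp -p "$target_bin" "$target_bin.bak"
    fi
    cp "$new_bin" "$target_bin"
}
```

With the paths from the steps above, the call would be install_uefi "$EDK2_BUILD_ROOT/images/uefi_Jetson_RELEASE.bin" "$JETPACK/bootloader/uefi_jetson.bin", run with whatever privileges the Linux_for_Tegra folder requires.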

Flash bootloader only without overwriting the APP partition:

sudo ./flash.sh -c bootloader/generic/cfg/flash_t234_qspi.xml jetson-orin-nano-devkit internal
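
Before handing the flash command to someone remote, a small preflight check can confirm that they are in the right folder. The sketch below only prints the command quoted above and does not flash anything; the board config name and cfg path are taken from this thread, and the function name qspi_flash_preflight is hypothetical.

```shell
#!/bin/sh
# qspi_flash_preflight L4T_DIR
# Checks that flash.sh and the QSPI-only partition layout exist under L4T_DIR,
# then prints (does not run) the flash command quoted in this thread.
qspi_flash_preflight() {
    l4t_dir="$1"
    cfg="bootloader/generic/cfg/flash_t234_qspi.xml"
    for f in flash.sh "$cfg"; do
        if [ ! -f "$l4t_dir/$f" ]; then
            echo "missing: $l4t_dir/$f" >&2
            return 1
        fi
    done
    echo "cd $l4t_dir && sudo ./flash.sh -c $cfg jetson-orin-nano-devkit internal"
}
```

The unit still has to be in recovery mode and connected over USB before the printed command is run; the check only covers the host-side files.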

Verification steps

Step 1: Enter UEFI Menu.

Step 2: Select Device Manager → NVIDIA Configuration → Reset Setting.

Step 3: Go back to the top menu and press Reset to exit.

Step 4: Check whether it boots successfully.

Repeat Steps 1 to 4 about 5 times to check whether the assertion issue occurs.

Hi:
About git: git apply changes your local code as shown in the commit. After applying it, you still need to git commit and git push if you want the changes on GitHub or GitLab, depending on your git server.
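
To make the point concrete, here is a small sketch that dry-runs a patch with git apply --check, applies it only if it is clean, and then lists the locally modified files. Nothing is committed or pushed. The function name apply_patch_checked is hypothetical, and PATCH_FILE should be an absolute path because git -C changes directory first.

```shell
#!/bin/sh
# apply_patch_checked REPO_DIR PATCH_FILE
# Dry-runs the patch first; applies it only if it would apply cleanly,
# then lists the files changed in the working tree.
# PATCH_FILE should be an absolute path (git -C switches to REPO_DIR first).
apply_patch_checked() {
    repo_dir="$1"
    patch_file="$2"
    if ! git -C "$repo_dir" apply --check "$patch_file"; then
        echo "patch does not apply cleanly: $patch_file" >&2
        return 1
    fi
    git -C "$repo_dir" apply "$patch_file"
    # Only the working tree has changed at this point; no commit was created.
    git -C "$repo_dir" status --porcelain
}
```

For example: apply_patch_checked "$EDK2_BUILD_ROOT/edk2-nvidia" "$HOME/e96433c.diff", assuming the diff was saved to the home directory. If the sources already contain the change, the check fails and nothing is modified.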

As for the optee image, I haven’t read the documents about it either. I shared the steps because I thought it was a UEFI issue, not an optee image issue. So I’ll leave it to @KevinFFF.

Thanks.

Thanks.

@KevinFFF

I kindly request you to provide complete, organised, sequential, documented steps so that someone with little technical knowledge can execute them one by one, reflash QSPI, and fix the assertion problem.

Thanks

All required steps should be included in Assertion issue in UEFI during boot - Jetson & Embedded Systems / Jetson AGX Orin - NVIDIA Developer Forums.

I have to mention again that the fix is included in the tos image, so you have to build the stmm image first (i.e. build uefi_StandaloneMmOptee_RELEASE.bin through $ edk2_docker edk2-nvidia/Platform/NVIDIA/StandaloneMmOptee/build.sh) and then follow each step in atf_and_optee_README.txt to build optee for the tos image.

Sorry for the inconvenience caused by this issue; I created a topic to address it.
I know the steps in atf_and_optee_README.txt are a little complicated, but they are necessary to apply the fix.
If you don’t want to do that, please wait for the next release.
I’ve checked internally, and the next release may come in early January (it should be R36.4.2).


Thanks for the updates.

I have some queries; please help us, as we are helping the customer fix the issue remotely and they are not technically strong.

  1. I was thinking of using this backup/restore command, but later realized it won’t do the job, as it only replaces the NVMe data, whereas our issue is QSPI corruption. Please confirm whether I am correct.

$ sudo ./tools/backup_restore/l4t_backup_restore.sh -e nvme0n1 -r jetson-orin-nano-devkit

  2. Secondly, we provided download and build steps for a complete flash (both QSPI + NVMe), but he is getting errors.
    My question: what if we ask him to run just the normal QSPI flash once again, without the UEFI patch, and see whether it boots? Is this feasible? Let us know.

sudo ./flash.sh -c bootloader/generic/cfg/flash_t234_qspi.xml jetson-orin-nano-devkit internal

  3. atf_and_optee_README.txt says:

This package contains the necessary files and instructions to build a
trusted OS image based on ATF and OP-TEE for these Jetson devices:

  • Jetson AGX Orin series

Here we are working on a Jetson Orin NX 16 GB module. Are these steps applicable to Orin NX as well?

Thanks.

This command would work to flash QSPI only (the tos image is stored in a QSPI partition). You don’t need to run the backup/restore script.

It should also be applicable to Orin NX/Nano.

You can generate the tos image for your customer to fix this issue.

Thanks.

Today I will try generating the tos image and will report any errors or issues here.

Yesterday I was able to generate the uefi_StandaloneMmOptee_RELEASE.bin file from the UEFI source code build using the docker method.

I was finally able to build tos.img file successfully using the command:

./nv_public_src_build_tos.sh -p t234 -u /home/trident/Downloads/r36_3_0/UEFI_Docker_Build/uefiWorkspace/images/uefi_StandaloneMmOptee_RELEASE.bin -s /home/trident/Downloads/r36_3_0/Linux_for_Tegra/nv_tegra/tos-scripts/gen_tos_part_img.py

But I have a few doubts; @KevinFFF, please clarify them.

  1. The UEFI sources which I downloaded through the docker method already had the patch applied, but I see a step for applying the patch in your procedure. Why?

  2. In the link below:
    OP-TEE setup questions for the atf_and_optee_README.txt - #5 by JerryChang

I see some additional steps which are missing from your procedure:

$ export CROSS_COMPILE=/home/jerry/L4T/l4t-gcc/aarch64--glibc--stable-2022.08-1/bin/aarch64-buildroot-linux-gnu-
$ export NV_TARGET_BOARD=t234
$ sudo apt remove python3-cryptography
$ pip3 install cryptography
$ ./nv_public_src_build_tos.sh -p t234 -u $OUT/JP-6/Linux_for_Tegra/bootloader/standalonemm_optee_t234.bin -s $OUT/JP-6/Linux_for_Tegra/nv_tegra/tos-scripts/gen_tos_part_img.py

  3. The UEFI_STMM_PATH in your procedure is different from the one in this link. Please clarify.

$ export UEFI_STMM_PATH=$OUT/JP_6/Linux_for_Tegra/bootloader/standalonemm_optee_t234.bin

  4. We are getting the unit back for reflashing. I will flash this tos.img and see whether it resolves the assertion 3 problem.

Thanks.

Which UEFI branch are you using? (i.e. please share the command you used to clone the source)

These steps are included in atf_and_optee_README.txt, which I’ve mentioned in Step 3.

You didn’t export UEFI_STMM_PATH here.

What do you mean by “different”?
Please just export UEFI_STMM_PATH with the stmm image you built from the UEFI source.
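
Since a missing or wrong UEFI_STMM_PATH is easy to overlook, a small guard like the one below can be run before the OP-TEE build. It is only a sketch and is not part of atf_and_optee_README.txt; the function name check_stmm_path is hypothetical.

```shell
#!/bin/sh
# check_stmm_path
# Succeeds only if UEFI_STMM_PATH is set and names a non-empty file,
# so that the OP-TEE build does not start with a missing StMM image.
check_stmm_path() {
    if [ -z "${UEFI_STMM_PATH:-}" ]; then
        echo "UEFI_STMM_PATH is not set" >&2
        return 1
    fi
    if [ ! -s "$UEFI_STMM_PATH" ]; then
        echo "no StMM image at: $UEFI_STMM_PATH" >&2
        return 1
    fi
    echo "using StMM image: $UEFI_STMM_PATH"
}
```

Typical use would be to export UEFI_STMM_PATH to the uefi_StandaloneMmOptee_RELEASE.bin produced by the UEFI build, then run check_stmm_path before the build script.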

I referred the below link:

and cloned using the below command:

edk2_docker init_edkrepo_conf
edk2_docker edkrepo manifest-repos add nvidia https://github.com/NVIDIA/edk2-edkrepo-manifest.git main nvidia

# Clone

edk2_docker edkrepo clone $EDK2_BUILD_ROOT NVIDIA-Platforms main
sudo chown -hR $USER ./*

I don’t see the specific commands below in the file atf_and_optee_README.txt.
I have attached atf_and_optee_README.txt to this post for your reference so you can cross-check.
Note: I executed the additional commands below, apart from the commands already in atf_and_optee_README.txt.

$ export CROSS_COMPILE=/home/jerry/L4T/l4t-gcc/aarch64--glibc--stable-2022.08-1/bin/aarch64-buildroot-linux-gnu-
$ export NV_TARGET_BOARD=t234
$ sudo apt remove python3-cryptography
$ pip3 install cryptography
$ ./nv_public_src_build_tos.sh -p t234 -u $OUT/JP-6/Linux_for_Tegra/bootloader/standalonemm_optee_t234.bin -s $OUT/JP-6/Linux_for_Tegra/nv_tegra/tos-scripts/gen_tos_part_img.py

Sorry for the confusion. I exported this variable as per your recommendation in the thread below. My other doubt is that I executed some additional commands which are not documented in the readme txt file.

I will explain more clearly. As per your recommendation in the thread below,
we should export UEFI_STMM_PATH from the UEFI sources path:

$ export UEFI_STMM_PATH=/images/uefi_StandaloneMmOptee_RELEASE.bin

whereas the atf_and_optee_README.txt file says the UEFI STMM image should be taken from the path:
<Linux_for_Tegra>/bootloader/standalonemm_optee_t234.bin

Why is there this difference?


Also, the command given in the readme txt file,
nv_public_src_build.sh, only builds the ATF source successfully and gives an error when building the OPTEE source.