Unhandled Exception in EL3 error after AGX Xavier reset

Ok, so is there any issue in reporting this to the board vendor first? I think the board is from ConnectTech, right?

We design and build the carrier board. Our company is Cornet Technology, Inc. I am now checking the power-up and power-down sequences; we use a power-button supervisor MCU in the auto-power-on configuration.

Ok. So you are the board vendor. Please state this clearly when you file a topic.

If this is a custom board, did you remember to set the cvb eeprom read size to 0 in the BCT cfg file?
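A quick way to verify that setting before flashing is to read the value back from the cfg text. The following is a minimal sketch, not NVIDIA tooling: it assumes the setting appears on a line of the form `<prefix>cvb_eeprom_read_size = <value>;`. The key spelling is an assumption (the thread only says "cvb eeprom read size"), and the function name is hypothetical.

```python
import re

# Assumed key name; the thread only says "cvb eeprom read size".
KEY = "cvb_eeprom_read_size"

# Matches e.g. "eeprom.cvb_eeprom_read_size = 0x100;" at the start of a line.
# Lines starting with a comment character such as '#' do not match.
_PATTERN = re.compile(
    r"^\s*[\w.]*" + re.escape(KEY) + r"\s*=\s*(0[xX][0-9a-fA-F]+|\d+)",
    re.MULTILINE,
)


def cvb_eeprom_read_size(cfg_text):
    """Return the configured read size as an int, or None if the key is absent."""
    match = _PATTERN.search(cfg_text)
    if match is None:
        return None
    raw = match.group(1)
    base = 16 if raw.lower().startswith("0x") else 10
    return int(raw, base)


def is_disabled_for_custom_board(cfg_text):
    """True only when the key is present and explicitly set to 0."""
    return cvb_eeprom_read_size(cfg_text) == 0
```

To use it, read the BCT cfg file you flash with into a string and pass it to `is_disabled_for_custom_board`; a `None` from `cvb_eeprom_read_size` means the assumed key was not found in that file at all.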

Also, since the error is in UEFI, could you build the UEFI debug build to enable the full log?

Hello, I am with Cornet, working on the same project as klaml3o9. I am trying to build the UEFI on an Ubuntu Linux system and I get the errors pasted at the end.

I followed the guidelines to install mono and updated the images, but I still get the errors. Could you help me get over this hump? Do you think it is better to compile the UEFI on Windows, since the NuGet tool is inherently a Windows tool?

Thanks

SECTION - Initial update of environment
UpdatingWARNING - [SDE] Failed to fetch NugetDependecy: edk2-acpica-iasl@20200717.0.0: [Nuget] We failed to install this version 20200717.0.0 of edk2-acpica-iasl
WARNING - [SDE] Failed to fetch NugetDependecy: mu_nasm@2.15.05: [Nuget] We failed to install this version 2.15.05 of mu_nasm
. Done
SECTION - Second pass update of environment
UpdatingWARNING - [SDE] Failed to fetch NugetDependecy: mu_nasm@2.15.05: [Nuget] We failed to install this version 2.15.05 of mu_nasm
.WARNING - [SDE] Failed to fetch NugetDependecy: edk2-acpica-iasl@20200717.0.0: [Nuget] We failed to install this version 20200717.0.0 of edk2-acpica-iasl
. Done
ERROR - We were unable to successfully update 2 dependencies in environment
SECTION - Summary
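When this output gets long, it can help to reduce it to the list of packages that actually failed. The sketch below only parses the warning lines in the format pasted above (including the tool's own `NugetDependecy` spelling); the function name is hypothetical and nothing here calls the build tools themselves.

```python
import re

# Pattern taken from the warning lines above, e.g.
# "Failed to fetch NugetDependecy: mu_nasm@2.15.05: [Nuget] ..."
FAILED_DEP = re.compile(r"Failed to fetch NugetDependecy: ([^@\s]+)@([^:\s]+):")


def failed_dependencies(log_text):
    """Return the sorted, de-duplicated (name, version) pairs that failed."""
    return sorted(set(FAILED_DEP.findall(log_text)))


if __name__ == "__main__":
    import sys

    # Usage: pipe the build output into this script on stdin.
    for name, version in failed_dependencies(sys.stdin.read()):
        print(f"{name} {version}")
```

Because the same dependency is reported in both the initial and the second pass, the de-duplication makes the result match the "2 dependencies" count in the final error line.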

What steps did you follow to build UEFI? Did you follow the guidance from our public source code tarball?

Here is the debug version I got

uefi_Jetson_DEBUG.bin (2.7 MB)

Replace Linux_for_Tegra/bootloader/uefi_jetson.bin with it and re-flash.

Yes. One of the guides I followed was edk2-pytool-extensions/using_linux.md (master branch of tianocore/edk2-pytool-extensions on GitHub), which has suggestions for compiling on Linux.

Once the crash occurs, one way to recover quickly without reflashing everything is to flash only the BCT:

sudo ./flash.sh -r -k BCT jetson-agx-xavier-industrial-cti mmcblk0p1

This seems to fix it until the next crash.

Using the debug UEFI from user10090, these are the logs. It seems to be in some kind of loop with errors. I am not sure what these errors are.

logs_with_debug_uefi_version (27.8 KB)

Hi,

I feel you are hitting a different issue from the one initially reported. For example, you don’t even enter UEFI now.

Could you monitor your board's UART log after it is flashed with a clean image, and see how to reproduce this error or whether any specific error is logged before the problem happens?

xavier_uefi_debug_output_with_crash (13.5 KB)

Instead of flashing only the debug UEFI, I reflashed the whole system.img, which includes the debug UEFI. After the reflash I power cycled the board; the serial output is stored in the attached file.

After the crash, I ran the command
sudo ./flash.sh -r -k BCT jetson-agx-xavier-industrial-cti mmcblk0p1

and the system came up. This time the UEFI prints a lot of debug information
with
PROGRESS CODE: V03051005 I0
PROGRESS CODE: V03050000 I0
and
Deleting fragment fragment@0
Deleting fragment fragment@1
etc…

So it looks like the UEFI debug version does work (earlier I had tried flashing only the UEFI, and that did not seem to work).

From the logs, it looks like what you deduced before is correct, i.e. it is not reaching the UEFI. What else could it be? How should we approach this issue? It happens very consistently when we power cycle.

Thank you,

Hi,

Is there a consistent crash point in the log?

I mean, the earliest log crashed in UEFI, a later one crashed before even entering UEFI, and the log you just shared crashed in UEFI again.

Does the log you just shared already have the UEFI debug log enabled?

The current logs are the correct ones, with UEFI debug enabled (that is, using the UEFI image from user100090). I can confirm that it is a debug UEFI because I see a lot of extra debug messages from the UEFI when the system does come up correctly. After I power cycle, it goes into the crash mode shown in the logs. There I do not see any of the debug messages, which tells me that the UEFI has not been entered.
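The reasoning above (no UEFI output before the exception means UEFI was never entered) can be applied mechanically to each captured serial log. The following is a rough sketch under stated assumptions: the marker strings are the ones quoted in this thread, the log covers a single boot attempt, and the category names are made up for illustration. UART output can be garbled, so an exact-match miss is not proof that a stage was skipped.

```python
# Marker strings as they appear in this thread's logs.
UEFI_MARKERS = ("Jetson UEFI firmware", "PROGRESS CODE:")
EL3_MARKER = "Unhandled Exception in EL3"


def classify_boot_log(log_text):
    """Classify one boot attempt by which markers show up in its serial log."""
    entered_uefi = any(marker in log_text for marker in UEFI_MARKERS)
    el3_exception = EL3_MARKER in log_text

    if el3_exception and entered_uefi:
        return "EL3 exception, UEFI output present"
    if el3_exception:
        return "EL3 exception, no UEFI output"
    if entered_uefi:
        return "UEFI output present, no EL3 exception"
    return "no known marker"


if __name__ == "__main__":
    import sys

    # Usage: pass one or more saved serial log files as arguments.
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            print(path, "->", classify_boot_log(handle.read()))
```

Running this over the logs from several power cycles would show at a glance whether the crash point is consistent or moves between stages.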

Is it possible to test that module on a devkit and see whether the same test leads to the same failure?

If it is not reproducible there, then I can only ask some hardware folks for suggestions.

Unfortunately, we do not have a devkit for the Xavier. We only have one for Orin (which will be custom made in the future). The custom boards we have are all Xavier. Would you be able to ask the hardware folks whether they have any suggestions?

The other question we had: when flash.sh is run with the BCT partition, what does it do, and why does running it seem to fix the situation? Just running flash.sh for the BCT partition (which is much quicker than flashing the whole image) seems to fix the crash issue. Is there any correlation between a power outage and the BCT area losing information?

Thanks,

Hi,

I also want to know: if you use the same board/module with JetPack 4, do you hit this issue?

I will download and install jetpack 4.6.1 and give it a try.

Thanks,

jetpack_4_logs (4.9 KB)
I downloaded JetPack 4.5.1 and tried it.

The logs are attached. It hangs, saying HALT: spinning forever.

There is no jetson-agx-xavier-industrial conf, so I used the jetson-agx-xavier-devkit conf on our custom board; I am not sure whether that is what is messing it up.

But after a power cycle while it is in the HALT: spinning forever stage, it goes back to the old UEFI:
Jetson UEFI firmware (version r34.1-975eef6 built on 2022-05-16T20:58:45-07:00)
ception in EL3.R 0x80000000: exception reason=0 syndromeUnxbndl0000
x30Unhandled Exception in EL3.
x30

I don’t think the JetPack 4.5.1 flash actually wrote to mmcblk0p1.

Thanks,

Hi,

Support for AGX Industrial was added in JP4.6, so you should use at least JP4.6.

I tried the 4.6.1 sdkmanager; it downloads sdkmanager_1.8.1-10392_amd64.deb, which once installed goes to version 5.0.1.

The sdkmanager that seems to work is sdkmanager_1.6.1-8175_amd64.deb.

From the list, which one would you suggest I use?

sdkmanager_1.8.1:
deb | rpm | Docker image 18.04 | Docker image 20.04
sdkmanager_1.8.0:
deb | rpm | Docker image 18.04 | Docker image 20.04
sdkmanager_1.7.3:
deb | rpm | Docker image
sdkmanager_1.7.2:
deb | rpm | Docker image
sdkmanager_1.7.1:
deb | rpm | Docker image
sdkmanager_1.7:
deb | rpm | Docker image
sdkmanager_1.6.1:

Thanks,

Hi,

You don’t need to downgrade the sdkmanager version. Sdkmanager 1.8.1 can also install JP4.x.

But the problem is that JP4.x can only be flashed from an Ubuntu 16.04 or 18.04 host. Thus, if sdkmanager shows no option for JP4, it probably means your host version is the problem.
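The host check described above can be scripted before launching sdkmanager. This is a small sketch, assuming the standard `/etc/os-release` file with `ID` and `VERSION_ID` fields; the set of accepted versions is only what is stated in this thread, and the function names are hypothetical.

```python
# Host versions named in this thread as able to flash JP4.x.
JP4_HOST_VERSIONS = {"16.04", "18.04"}


def parse_os_release(text):
    """Parse KEY=value lines from os-release text into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be quoted, e.g. VERSION_ID="18.04".
        fields[key.strip()] = value.strip().strip('"').strip("'")
    return fields


def host_can_flash_jp4(os_release_text):
    """True if the host is Ubuntu 16.04 or 18.04 according to os-release."""
    fields = parse_os_release(os_release_text)
    return fields.get("ID") == "ubuntu" and fields.get("VERSION_ID") in JP4_HOST_VERSIONS


if __name__ == "__main__":
    with open("/etc/os-release") as handle:
        ok = host_can_flash_jp4(handle.read())
    print("Host OK for JP4.x" if ok else "Host is not Ubuntu 16.04/18.04")
```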

BTW, why are you still asking me how to flash JP4? Is this your first time doing bring-up for a custom board, and have you only ever tried JP5?