Fail to validate the VarStore on UEFI

I'm working with an AGX Orin 64GB on our custom carrier board, running R35.5.0.

I ran a power on/off test (cutting power without a shutdown) at high temperature.
After a number of power cuts, the board stopped booting with the following error.
teraterm.log (76.4 KB)

ASSERT [FvbNorFlashStandaloneMm] /dvs/git/dirty/git-master_linux/out/nvidia/optee.t234-uefi/StandaloneMmOptee_RELEASE/edk2-nvidia/Silicon/NVIDIA/Drivers/FvbNorFlashDxe/FvbNorFlashStandaloneMm.c(868): ((BOOLEAN)(0==1))

The source code at that location reads as follows, which indicates that the variable space is corrupted.

[edk2-nvidia/Silicon/NVIDIA/Drivers/FvbNorFlashDxe/FvbNorFlashStandaloneMm.c]
    /* We're here, which means there is a non-erased Variable Integrity space
     * that isn't matching our expected measurement.
     */
    ASSERT (FALSE);
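
As I read that comment, the check has three possible outcomes: the integrity space is still erased, it holds a measurement that matches, or it holds one that does not match (the ASSERT case). Below is a minimal sketch of that logic as I understand it. It is my own illustration, not the edk2-nvidia code: the names `integrity_state`, `toy_measure` and `check_integrity` are hypothetical, and a toy FNV-1a checksum stands in for whatever measurement the driver really uses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outcomes of the check; these are not edk2-nvidia names. */
typedef enum {
    INTEGRITY_ERASED,   /* integrity space is all 0xFF: nothing recorded yet */
    INTEGRITY_MATCH,    /* recorded measurement equals the computed one */
    INTEGRITY_MISMATCH  /* recorded but different: the ASSERT (FALSE) case */
} integrity_state;

/* Toy 32-bit FNV-1a checksum, an assumed stand-in for the real measurement. */
static uint32_t toy_measure(const uint8_t *data, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

/* NOR flash reads back as 0xFF after an erase. */
static bool is_erased(const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0xFF) {
            return false;
        }
    }
    return true;
}

/* Compare the recorded (little-endian) measurement with one computed
 * over the variable store contents. */
static integrity_state check_integrity(const uint8_t *var_store,
                                       size_t store_len,
                                       const uint8_t recorded[4])
{
    if (is_erased(recorded, 4)) {
        return INTEGRITY_ERASED;
    }
    uint32_t stored = (uint32_t)recorded[0]
                    | ((uint32_t)recorded[1] << 8)
                    | ((uint32_t)recorded[2] << 16)
                    | ((uint32_t)recorded[3] << 24);
    uint32_t expected = toy_measure(var_store, store_len);
    return (stored == expected) ? INTEGRITY_MATCH : INTEGRITY_MISMATCH;
}
```

In this model, any change to the variable store that is not followed by a matching update of the recorded measurement ends in `INTEGRITY_MISMATCH`, which is the state my board seems to be stuck in.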

This problem occurs only when multiple USB devices are plugged in at high temperatures.

I have two questions.
1. Is it possible for a power-off to corrupt the UEFI variable area?
I don't think there is any write access from Ubuntu.
2. Do you have any good ideas for preventing this problem?

Hi S.Harumoto,

Please refer to "Cannot boot after Reset Settings on UEFI - #10 by KevinFFF" for an issue similar to yours, and wait for the next release for the fix.

Thanks for the reply.
This issue is caused by a power-off, but the linked issue is caused by the Reset Settings button.
Are these really caused by the same problem?

Could you share the detailed steps to reproduce this issue so that we can verify it locally?

Here are my steps.

  1. Connect four USB 3.0 storage devices.
  2. Bring the CPU module to a high temperature (CPU temp: 90°C as reported by tegrastats).
  3. Power on the Jetson.
  4. Wait for Ubuntu to start.
  5. Cut the power without shutting down.

Repeat steps 3 to 5 about 1000 to 1500 times, and the issue will occur.

I do not know whether it can be reproduced on the developer kit.

How did you perform this? Do you use any external device to heat it up?

Are you running these manually, or using a script, to test 1000 to 1500 times?
The reproduction rate seems low.

Do you have the devkit to reproduce the same issue?

Do you use any external device to heat it up?

Yes, I use an environmental test chamber.

Are you running these manually, or using a script, to test 1000 to 1500 times?

Not manually. I use a programmable timer to switch the power on and off.

Do you have the devkit to reproduce the same issue?

Yes, I am trying to reproduce it with the devkit.

Okay, so it seems hard for us to verify locally.

Do you power the board, and cut its power, from an external source?

Can this state be recovered on the next boot?
(i.e., does a reset help it recover?)

Do you power the board, and cut its power, from an external source?

Yes, I do.

Can this state be recovered on the next boot?

No, I need to reflash to recover.

So it seems something in your test is causing the UEFI variables to change.
Please check whether you can reproduce this on the devkit, or find a simpler method to reproduce the same issue.

I will let you know if I find a way to reproduce it with the devkit.