UEFI boot failed: RAS Uncorrectable Error in SCC

Hi Nvidia,

I have an Orin NX 16GB module on a custom carrier board.
L4T 35.2.1 is installed on the system and works fine.

We are testing the stability of the system.
After rebooting the system many times (roughly 300 to 1000 times, sometimes more),
I got the following error messages during UEFI boot, and the boot failed.

I> MB2 finished

▒▒NOTICE:  BL31: v2.6(release):6363e7382
NOTICE:  BL31: Built : 15:09:30, Jan 24 2023
ERROR:   **************************************
ERROR:   RAS Uncorrectable Error in SCC, base=0xe019000:
ERROR:          Status = 0xec00090d
ERROR:   SERR = Illegal address (software fault): 0xd
ERROR:          IERR = Address Range Error: 0x9
ERROR:          Overflow (there may be more errors) - Uncorrectable
ERROR:          MISC0 = 0xe000000
ERROR:          MISC1 = 0x4e991
ERROR:          MISC2 = 0x0
ERROR:          MISC3 = 0x0
ERROR:          ADDR = 0x80000041cb30f040
ERROR:   **************************************
ERROR:   sdei_dispatch_event returned -1
/TC:
I/TC: Non-secure external DT found
I/TC: OP-TEE version: 3.19 (gcc version 9.3.0 (Buildroot 2020.08)) #2 Tue Jan 24 23:20:42 UTC 2023 aarch64
I/TC: WARNING: This OP-TEE configuration might be insecure!
I/TC: WARNING: Please check https://optee.readthedocs.io/en/latest/architecture/porting_guidelines.html
I/TC: Primary CPU initializing
I/TC: WARNING: Test OEM keys are being used!
I/TC: This is only for TZ-SE testing and should NOT be used for a shipping product!
I/TC: Primary CPU switching to normal world boot
▒▒
  Jetson UEFI firmware (version 2.1-32413640 built on 2023-01-24T23:12:27+00:00)

Jetson UEFI firmware (version 2.1-32413640 built on 2023-01-24T23:12:27+00:00)
ESC   to enter Setup.
F11   to enter Boot Manager Menu.
Enter to continue boot.
**  WARNING: Test Key is used.  **
......
      L4TLauncher: Attempting Direct Boot
EFI stub: Booting Linux Kernel...
EFI stub: Using DTB from configuration table
EFI stub: Loaded initrd from LINUX_EFI_INITRD_MEDIA_GUID device path
EFI stub: Exiting boot services and installing virtual address map...
▒▒ERROR:   **************************************
ERROR:   RAS Uncorrectable Error in SCC, base=0xe018000:
ERROR:          Status = 0xec00090d
ERROR:   SERR = Illegal address (software fault): 0xd
ERROR:          IERR = Address Range Error: 0x9
ERROR:          Overflow (there may be more errors) - Uncorrectable
ERROR:          MISC0 = 0xa000000
ERROR:          MISC1 = 0x22991
ERROR:          MISC2 = 0x0
ERROR:          MISC3 = 0x0
ERROR:          ADDR = 0x80000041cb30f200
ERROR:   **************************************
ERROR:   sdei_dispatch_event returned -1
ERROR:   **************************************
ERROR:   RAS Uncorrectable Error in SCC, base=0xe019000:
ERROR:          Status = 0xec00090d
ERROR:   SERR = Illegal address (software fault): 0xd
ERROR:          IERR = Address Range Error: 0x9
ERROR:          Overflow (there may be more errors) - Uncorrectable
ERROR:          MISC0 = 0x16000000
ERROR:          MISC1 = 0x19991
ERROR:          MISC2 = 0x0
ERROR:          MISC3 = 0x0
ERROR:          ADDR = 0x80000041cb30f040
ERROR:   **************************************
ERROR:   sdei_dispatch_event returned -1
/TC: Secondary CPU 1 initializing
I/TC: Secondary CPU 1 switching to normal world boot
I/TC: Secondary CPU 2 initializing
I/TC: Secondary CPU 2 switching to normal world boot
I/TC: Secondary CPU 3 initializing
I/TC: Secondary CPU 3 switching to normal world boot
I/TC: Secondary CPU 4 initializing
I/TC: Secondary CPU 4 switching to normal world boot
I/TC: Secondary CPU 5 initializing
I/TC: Secondary CPU 5 switching to normal world boot
I/TC: Secondary CPU 6 initializing
I/TC: Secondary CPU 6 switching to normal world boot
I/TC: Secondary CPU 7 initializing
I/TC: Secondary CPU 7 switching to normal world boot

If I power the system off and on again, it boots fine.

The carrier board doesn’t have an EEPROM, so I set

cvb_eeprom_read_size = <0x0>;

I can’t reproduce this issue with the Orin NX module on the Xavier NX DevKit (NVMe).

This is the UART log file.
UEFI_boot_failed.txt (31.7 KB)
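
To pull the RAS records out of a long UART capture like this one, a small script can help. The following is a minimal sketch in Python, assuming the log keeps the `ERROR:` line layout shown above. The names `find_ras_errors` and `SAMPLE_LOG` are illustrative and not part of any NVIDIA tool.

```python
import re

# Patterns match the BL31 RAS print format seen in the log above.
RAS_HEADER = re.compile(r"RAS Uncorrectable Error in (\w+), base=(0x[0-9a-fA-F]+)")
RAS_FIELD = re.compile(r"ERROR:\s+(Status|MISC[0-3]|ADDR) = (0x[0-9a-fA-F]+)")


def find_ras_errors(log_text):
    """Return one dict per RAS error block found in a UART log (hypothetical helper)."""
    records = []
    current = None
    for line in log_text.splitlines():
        header = RAS_HEADER.search(line)
        if header:
            # A header line starts a new record.
            current = {"unit": header.group(1), "base": int(header.group(2), 16)}
            records.append(current)
            continue
        field = RAS_FIELD.search(line)
        if field and current is not None:
            # Register lines are attached to the most recent header.
            current[field.group(1)] = int(field.group(2), 16)
    return records


# Short excerpt copied from the log above, used only as sample input.
SAMPLE_LOG = """\
ERROR:   RAS Uncorrectable Error in SCC, base=0xe019000:
ERROR:          Status = 0xec00090d
ERROR:          MISC0 = 0xe000000
ERROR:          ADDR = 0x80000041cb30f040
ERROR:   sdei_dispatch_event returned -1
"""

for record in find_ras_errors(SAMPLE_LOG):
    print({key: hex(value) if isinstance(value, int) else value
           for key, value in record.items()})
```

Running it over the logs of many reboot cycles would show whether the same `base` and `ADDR` values come back each time.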

How can I investigate this error message? Thanks.
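
One way to read the `Status` value is to split it into the fields of the Arm RAS `ERR<n>STATUS` register. The sketch below assumes the standard Arm RAS Extension bit layout. The function name `decode_ras_status` is illustrative. IERR is implementation defined, so its text ("Address Range Error") comes from the firmware print, not from this code.

```python
def decode_ras_status(status):
    """Split an ERR<n>STATUS value into fields (assumed Arm RAS Extension layout)."""
    return {
        "AV": bool((status >> 31) & 1),   # address register (ADDR) is valid
        "V": bool((status >> 30) & 1),    # status register is valid
        "UE": bool((status >> 29) & 1),   # uncorrected error recorded
        "ER": bool((status >> 28) & 1),   # error reported to the requester
        "OF": bool((status >> 27) & 1),   # overflow, more errors may have occurred
        "MV": bool((status >> 26) & 1),   # MISC registers are valid
        "CE": (status >> 24) & 0x3,       # corrected error field
        "DE": bool((status >> 23) & 1),   # deferred error
        "PN": bool((status >> 22) & 1),   # poison
        "UET": (status >> 20) & 0x3,      # uncorrected error type
        "IERR": (status >> 8) & 0xFF,     # implementation-defined error code
        "SERR": status & 0xFF,            # architecturally defined error code
    }


# Status value taken from the log above.
fields = decode_ras_status(0xEC00090D)
print(fields)
```

For `0xec00090d` this gives SERR = 0xd and IERR = 0x9 with the UE and OF bits set, which agrees with the decoded lines that BL31 already prints.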

Hi Wilson_Lin,

I’ve found a thread about an error message similar to yours. Please refer to it and check whether it helps.

Have you referred to Enable PCIe in a Customer CVB Design for your custom board?

Hi Kevin,

Yes, I enabled C7x1 on the board but disabled C9x1,
because we have a PCIe LAN chip on the board that is connected to C9.
This is my post

I know that C9 doesn’t work, so I commented it out.

p3767.conf.common

ODMDATA="gbe-uphy-config-9,hsstp-lane-map-3,hsio-uphy-config-0";

cvb/tegra234-p3509-a02-pcie.dtsi

        pcie@141e0000 { /* C7x1 node */
                status = "okay";
                phys = <&p2u_gbe_0>;
                phy-names = "p2u-0";
        };

        //pcie@140c0000 { /* C9x1 */
        /*      status = "okay";
                phys = <&p2u_gbe_1>;
                phy-names = "p2u-0";
        };*/

Hi Kevin,

Do you have any update?

Hi,

If the reproduction rate is low, then it is a known issue, and it is not related to C9.

Hi WayneWWW,

Will it be fixed in the next release or later?

Hi WayneWWW,

I checked L4T 35.3.1.
The UEFI issue still exists.
Will it be resolved in the next release?

Yes, this issue will be fixed in the next release.

Hi @kayccc, could you please share the patch first?

Hi,
Is there any update on this?

  1. Will there be a patch for this issue?
  2. "This issue will be fixed in the next release." → Do you mean JetPack 5.1.2?

Thanks.

Hi WakkeWang and wayne_liao,

The RAS error will be fixed in the upcoming JP5.1.2 release.
Sorry, there are too many changes to provide patches for you to verify at this moment.
Please verify this issue after updating to JP5.1.2.