UEFI boot failed: RAS Uncorrectable Error in SCC

Hi NVIDIA,

I have an Orin NX 16GB module on a custom carrier board.
L4T 35.2.1 is installed on the system and works fine.

We are testing the stability of the system.
After rebooting the system many times (roughly 300 to 1000 times or more),
I got the following error message during UEFI boot, and the boot failed.

I> MB2 finished

NOTICE:  BL31: v2.6(release):6363e7382
NOTICE:  BL31: Built : 15:09:30, Jan 24 2023
ERROR:   **************************************
ERROR:   RAS Uncorrectable Error in SCC, base=0xe019000:
ERROR:          Status = 0xec00090d
ERROR:   SERR = Illegal address (software fault): 0xd
ERROR:          IERR = Address Range Error: 0x9
ERROR:          Overflow (there may be more errors) - Uncorrectable
ERROR:          MISC0 = 0xe000000
ERROR:          MISC1 = 0x4e991
ERROR:          MISC2 = 0x0
ERROR:          MISC3 = 0x0
ERROR:          ADDR = 0x80000041cb30f040
ERROR:   **************************************
ERROR:   sdei_dispatch_event returned -1
/TC:
I/TC: Non-secure external DT found
I/TC: OP-TEE version: 3.19 (gcc version 9.3.0 (Buildroot 2020.08)) #2 Tue Jan 24 23:20:42 UTC 2023 aarch64
I/TC: WARNING: This OP-TEE configuration might be insecure!
I/TC: WARNING: Please check https://optee.readthedocs.io/en/latest/architecture/porting_guidelines.html
I/TC: Primary CPU initializing
I/TC: WARNING: Test OEM keys are being used!
I/TC: This is only for TZ-SE testing and should NOT be used for a shipping product!
I/TC: Primary CPU switching to normal world boot
  Jetson UEFI firmware (version 2.1-32413640 built on 2023-01-24T23:12:27+00:00)

Jetson UEFI firmware (version 2.1-32413640 built on 2023-01-24T23:12:27+00:00)
ESC   to enter Setup.
F11   to enter Boot Manager Menu.
Enter to continue boot.
**  WARNING: Test Key is used.  **
......
      L4TLauncher: Attempting Direct Boot
EFI stub: Booting Linux Kernel...
EFI stub: Using DTB from configuration table
EFI stub: Loaded initrd from LINUX_EFI_INITRD_MEDIA_GUID device path
EFI stub: Exiting boot services and installing virtual address map...
ERROR:   **************************************
ERROR:   RAS Uncorrectable Error in SCC, base=0xe018000:
ERROR:          Status = 0xec00090d
ERROR:   SERR = Illegal address (software fault): 0xd
ERROR:          IERR = Address Range Error: 0x9
ERROR:          Overflow (there may be more errors) - Uncorrectable
ERROR:          MISC0 = 0xa000000
ERROR:          MISC1 = 0x22991
ERROR:          MISC2 = 0x0
ERROR:          MISC3 = 0x0
ERROR:          ADDR = 0x80000041cb30f200
ERROR:   **************************************
ERROR:   sdei_dispatch_event returned -1
ERROR:   **************************************
ERROR:   RAS Uncorrectable Error in SCC, base=0xe019000:
ERROR:          Status = 0xec00090d
ERROR:   SERR = Illegal address (software fault): 0xd
ERROR:          IERR = Address Range Error: 0x9
ERROR:          Overflow (there may be more errors) - Uncorrectable
ERROR:          MISC0 = 0x16000000
ERROR:          MISC1 = 0x19991
ERROR:          MISC2 = 0x0
ERROR:          MISC3 = 0x0
ERROR:          ADDR = 0x80000041cb30f040
ERROR:   **************************************
ERROR:   sdei_dispatch_event returned -1
/TC: Secondary CPU 1 initializing
I/TC: Secondary CPU 1 switching to normal world boot
I/TC: Secondary CPU 2 initializing
I/TC: Secondary CPU 2 switching to normal world boot
I/TC: Secondary CPU 3 initializing
I/TC: Secondary CPU 3 switching to normal world boot
I/TC: Secondary CPU 4 initializing
I/TC: Secondary CPU 4 switching to normal world boot
I/TC: Secondary CPU 5 initializing
I/TC: Secondary CPU 5 switching to normal world boot
I/TC: Secondary CPU 6 initializing
I/TC: Secondary CPU 6 switching to normal world boot
I/TC: Secondary CPU 7 initializing
I/TC: Secondary CPU 7 switching to normal world boot
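
The Status and ADDR values in the dump above can be split into their bit fields by hand. The following is a minimal sketch, assuming the registers follow the Arm RAS Extension layout for ERR<n>STATUS (AV bit 31, V bit 30, UE bit 29, ER bit 28, OF bit 27, MV bit 26, CE bits 25:24, IERR bits 15:8, SERR bits 7:0) and ERR<n>ADDR (NS bit 63, PADDR bits 55:0). The names `RasStatus`, `decode_status` and `decode_addr` are hypothetical and do not come from any NVIDIA or TF-A source.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class RasStatus:
    """Fields of a RAS status word (layout is an assumption, see lead-in)."""
    addr_valid: bool      # AV, bit 31
    valid: bool           # V,  bit 30
    uncorrected: bool     # UE, bit 29
    error_reported: bool  # ER, bit 28
    overflow: bool        # OF, bit 27
    misc_valid: bool      # MV, bit 26
    corrected: int        # CE, bits 25:24
    ierr: int             # IERR, bits 15:8 (implementation-defined code)
    serr: int             # SERR, bits 7:0 (architecturally defined code)


def decode_status(status: int) -> RasStatus:
    """Split a 32-bit status value into its assumed fields."""
    def bit(n: int) -> bool:
        return bool((status >> n) & 1)

    return RasStatus(
        addr_valid=bit(31),
        valid=bit(30),
        uncorrected=bit(29),
        error_reported=bit(28),
        overflow=bit(27),
        misc_valid=bit(26),
        corrected=(status >> 24) & 0x3,
        ierr=(status >> 8) & 0xFF,
        serr=status & 0xFF,
    )


def decode_addr(addr: int) -> Tuple[bool, int]:
    """Return (non_secure, physical_address) from a 64-bit ADDR value."""
    non_secure = bool((addr >> 63) & 1)
    paddr = addr & ((1 << 56) - 1)
    return non_secure, paddr


if __name__ == "__main__":
    # Values copied from the UART log in this post.
    print(decode_status(0xEC00090D))
    ns, paddr = decode_addr(0x80000041CB30F040)
    print(ns, hex(paddr))
```

Under that assumed layout, the decoded fields agree with what BL31 printed (SERR 0xd, IERR 0x9, overflow set, uncorrectable), and the PADDR field of the first reported ADDR is 0x41cb30f040.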

If I power the system off and on again, it boots fine.

The carrier board doesn’t have an EEPROM, so I set

cvb_eeprom_read_size = <0x0>;

I can’t reproduce this issue with the Orin NX module on a Xavier NX DevKit (NVMe).

This is the UART log file:
UEFI_boot_failed.txt (31.7 KB)
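
Because the failure shows up only once in several hundred reboots, it can help to scan a long UART capture automatically rather than by eye. The following is a minimal sketch that extracts each RAS block from a log in the format shown above. The function name `extract_ras_records` and the regular expressions are illustrative assumptions based on this one log, not part of any NVIDIA tool.

```python
import re
from typing import Dict, List

# Patterns derived from the BL31 output in this post (an assumption about
# the format; other firmware versions may print these lines differently).
HEADER = re.compile(r"RAS Uncorrectable Error in (\w+), base=(0x[0-9a-fA-F]+)")
FIELD = re.compile(r"ERROR:\s+(Status|MISC[0-3]|ADDR)\s*=\s*(0x[0-9a-fA-F]+)")

SAMPLE = """\
ERROR:   **************************************
ERROR:   RAS Uncorrectable Error in SCC, base=0xe018000:
ERROR:          Status = 0xec00090d
ERROR:   SERR = Illegal address (software fault): 0xd
ERROR:          IERR = Address Range Error: 0x9
ERROR:          MISC0 = 0xa000000
ERROR:          MISC1 = 0x22991
ERROR:          ADDR = 0x80000041cb30f200
ERROR:   **************************************
ERROR:   sdei_dispatch_event returned -1
ERROR:   **************************************
ERROR:   RAS Uncorrectable Error in SCC, base=0xe019000:
ERROR:          Status = 0xec00090d
ERROR:          MISC0 = 0x16000000
ERROR:          MISC1 = 0x19991
ERROR:          ADDR = 0x80000041cb30f040
ERROR:   **************************************
ERROR:   sdei_dispatch_event returned -1
"""


def extract_ras_records(log_text: str) -> List[Dict[str, object]]:
    """Collect one dict per RAS error block found in the log text."""
    records: List[Dict[str, object]] = []
    current = None
    for line in log_text.splitlines():
        header = HEADER.search(line)
        if header:
            # A header line starts a new record.
            current = {"unit": header.group(1), "base": int(header.group(2), 16)}
            records.append(current)
            continue
        field = FIELD.search(line)
        if field and current is not None:
            current[field.group(1)] = int(field.group(2), 16)
        elif "sdei_dispatch_event" in line:
            # This line follows each block in the log, so close the record.
            current = None
    return records


if __name__ == "__main__":
    for record in extract_ras_records(SAMPLE):
        print({k: hex(v) if isinstance(v, int) else v for k, v in record.items()})
```

To run it on a real capture, read the saved log (for example the attached UEFI_boot_failed.txt) with `open(path, errors="replace").read()` so that stray non-UTF-8 bytes from the serial line do not abort the scan, then pass the text to `extract_ras_records`.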

How can I investigate this error message? Thanks.

Hi Wilson_Lin,

I’ve found a thread about an error message similar to yours. Please refer to it and check whether it helps.

Have you referred to Enable PCIe in a Customer CVB Design for your custom board?

Hi Kevin,

Yes, I enabled C7x1 on the board but disabled C9x1,
because we have a PCIe LAN chip on the board that connects to C9.
This is my post

I know C9 doesn’t work, so I commented it out.

p3767.conf.common

ODMDATA="gbe-uphy-config-9,hsstp-lane-map-3,hsio-uphy-config-0";

cvb/tegra234-p3509-a02-pcie.dtsi

        pcie@141e0000 { /* C7x1 node */
                status = "okay";
                phys = <&p2u_gbe_0>;
                phy-names = "p2u-0";
        };

        //pcie@140c0000 { /* C9x1 */
        /*      status = "okay";
                phys = <&p2u_gbe_1>;
                phy-names = "p2u-0";
        };*/

Hi Kevin,

Do you have any update?

Hi,

If the reproduction rate is low, then it is a known issue, and it is not related to C9.


Hi WayneWWW,

Will it be fixed in the next release or a later one?