AGX Xavier - Boot Hangs - what does this error mean?

I was having trouble reflashing the AGX Xavier, so I connected to the serial console to see what might be wrong.

I am getting a strange error. What is happening?

(Skip to the bottom to see this set of messages in context.)

[0300.932] E> OCR failed, error = f0f0a06
[0300.933] E> Failed to open sdmmc-3, err = f0f0a06
[0300.933] E> Failed to initialize boot device
[0300.933] E> Top caller module: SDMMC, error module: SDMMC, reason: 0x06, aux_info: 0x0a
[0300.934] I> TBoot-CPU Recovery hang

^^ What is wrong here?
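
For anyone trying to read these codes: the "Top caller module / error module / reason / aux_info" line looks like a byte-by-byte breakdown of the hex error value. Below is a minimal Python sketch of that decoding. The field layout is an assumption inferred from this one log line, not taken from NVIDIA documentation, and `decode_boot_error` and `BootError` are hypothetical names used only for illustration.

```python
from typing import NamedTuple


class BootError(NamedTuple):
    top_module: int  # assumed: bits 31-24, the "top caller module" id
    module: int      # assumed: bits 23-16, the "error module" id
    aux_info: int    # assumed: bits 15-8
    reason: int      # assumed: bits 7-0


def decode_boot_error(code: int) -> BootError:
    """Split a 32-bit error word into four byte-wide fields.

    The layout is inferred from the log line above, where error
    f0f0a06 is reported with reason 0x06 and aux_info 0x0a.
    """
    return BootError(
        top_module=(code >> 24) & 0xFF,
        module=(code >> 16) & 0xFF,
        aux_info=(code >> 8) & 0xFF,
        reason=code & 0xFF,
    )


# The log prints the value without leading zeros, so parse it as hex.
err = decode_boot_error(int("f0f0a06", 16))
print(f"top=0x{err.top_module:02x} module=0x{err.module:02x} "
      f"aux_info=0x{err.aux_info:02x} reason=0x{err.reason:02x}")
```

Under that assumption, both module bytes come out as 0x0f, which matches the log reporting SDMMC as both the top caller and the error module. The sketch does not attempt to say what reason 0x06 means.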

Here is the full serial console output:

[0008.929] W> RATCHET: MB1 binary ratchet value 4 is larger than ratchet level 2 from HW fuses.
[0008.937] I> MB1 (prd-version: 2.6.0.0-t194-41334769-cab45716)
[0008.943] I> Boot-mode: Platform RCM
[0008.946] I> Platform: Silicon
[0008.949] I> Chip revision : A02P
[0008.952] I> Bootrom patch version : 15 (correctly patched)
[0008.957] I> ATE fuse revision : 0x200
[0008.960] I> Ram repair fuse : 0x0
[0008.963] I> Ram Code : 0x2
[0008.966] I> rst_source: 0x0, rst_level: 0x0
[0008.971] I> USB configuration success
[0011.091] I> mb2 image downloaded
[0011.114] I> Recovery boot mode 0
[0011.119] I> UPHY full init done
[0011.126] I> MB1 done

[0011.131] I> Welcome to MB2(TBoot-BPMP) Applet (version: default.t194-mobile-dba32e6d)
[0011.138] I> DMA Heap @ [0x40020000 - 0x40065800]
[0011.143] I> Default Heap @ [0xd486400 - 0xd48a400]
[0011.147] W> Profiler not initialized
[0012.151] E> OCR failed, error = f0f0a06
[0012.155] I> SDMMC is not present.
[0012.159] E> SPI_FLASH: Invalid value device id: 7.
[0012.164] I> QSPI Flash is not present.
[0012.170] E> CLK_RST: instance 6 not found in module 44.
[0012.175] E> MPhy CAR configuration failed error = 1747992077
[0012.181] E> UFS initialization failed
[0012.184] I> UFS is not present
[0012.187] W> Profiler not initialized
[0012.190] I> Entering 3p server
[0012.194] I> USB configuration success
[0012.261] I> Populate chip info
[0012.279] I> Populate eeprom info
[0012.283] I> Populate eeprom info for module cvm
[0012.288]  > DEVICE_PROD: device prod is not initialized.
[0012.358] I> Rebooting : reboot-recovery


[0012.363] I> Reset to recovery mode
[0292.381] W> RATCHET: MB1 binary ratchet value 4 is larger than ratchet level 2 from HW fuses.
[0292.389] I> MB1 (prd-version: 2.6.0.0-t194-41334769-cab45716)
[0292.394] I> Boot-mode: RCM
[0292.397] I> Platform: Silicon
[0292.400] I> Chip revision : A02P
[0292.403] I> Bootrom patch version : 15 (correctly patched)
[0292.408] I> ATE fuse revision : 0x200
[0292.412] I> Ram repair fuse : 0x0
[0292.415] I> Ram Code : 0x2
[0292.417] I> rst_source: 0xb, rst_level: 0x1
[0292.422] I> USB configuration success
[0294.431] I> bct_bootrom image downloaded
[0294.441] W> PROD_CONFIG: device prod data is empty in MB1 BCT.
[0294.448] I> Temperature = 35500
[0294.452] W> Skipping boost for clk: BPMP_CPU_NIC
[0294.456] W> Skipping boost for clk: BPMP_APB
[0294.460] W> Skipping boost for clk: AXI_CBB
[0294.464] W> Skipping boost for clk: AON_CPU_NIC
[0294.468] W> Skipping boost for clk: CAN1
[0294.472] W> Skipping boost for clk: CAN2
[0294.476] I> Boot-device: SDMMC (instance: 3)
[0294.480] I> bct_mb1 image downloaded
[0294.493] I> Non-ECC region[0]: Start:0x80000000, End:0x100000000
[0294.501] W>  Thermal config not found in BCT
[0294.509] W>  MEMIO rail config not found in BCT
[0294.526] I> bct_mem image downloaded
[0297.573] I> blob image downloaded
[0297.601] I> Recovery boot mode 0
[0297.645] W>  Platform config not found in BCT
[0297.678] I> MB1 done

main enter
SPE VERSION #: R01.00.18 Created: Jan 29 2021 @ 14:18:27
HW Function test
Start Scheduler.
in late init
[0297.689] I> Welcome to MB2(TBoot-BPMP) Recovery (version: default.t194-mobile-405defbc)
[0297.689] I> DMA Heap @ [0x526fa000 - 0x52ffa000]
[0297.690] I> Default Heap @ [0xd486400 - 0xd48a400]
[0297.692] E> DEVICE_PROD: Invalid value data = 70020000, size = 0.
[0297.697] W> device prod register failed
[0297.713] I> Relocating BR-BCT
[0297.714]  > DEVICE_PROD: device prod is not initialized.
[0297.740] E> I2C: slave not found in slaves.
[0297.741] E> I2C: Could not write 0 bytes to slave: 0x00ae with repeat start true.
[0297.742] E> I2C_DEV: Failed to send register address 0x00000000.
[0297.743] E> I2C_DEV: Could not read 256 registers of size 1 from slave 0xae at 0x00000000 via instance 0.
[0297.743] E> eeprom: Failed to read I2C slave device
[0297.747] I> Failed to read CVB eeprom data @ AE
[0297.751] I> Retrying CVB eeprom read @ AC ...
[0297.787] I> Relocating OP-TEE dtb from: 0x6bfffc10 to 0x70050000, size: 1008
[0297.788] I> [0] START: 0x80000000, SIZE: 0x47af0000
[0297.788] I> [1] START: 0xca000000, SIZE: 0x800000
[0297.789] I> dram_block larger than 80000000
[0297.789] I> [2] START: 0x100000000, SIZE: 0x780000000
[0297.794] I> Setting NS memory ranges to OP-TEE dtb finished.
[0297.797] I> found decompressor handler: lz4
[0298.367] I> EKB detected (length: 0x410) @ VA:0x526fa200
[0298.369] I> Setting EKB blob info to OPTEE dtb finished.
NOTICE:  BL31: v2.6(release):5e1f8b33d
NOTICE:  BL31: Built : 01:45:47, Aug 28 2024
I/TC: Physical secure memory base 0xcb040000 size 0xf00000
bpmp: init
bpmp: tag is 128431eec76692047e1ac1ebc0392266
sku_dt_init: not sku 0x00
I/TC:
clk_early initialized
mail_early initialized
fuse initialized
hwwdt initialized
t194_ec_get_ec_list: found 45 ecs
ec initialized
vmon_setup_monitors: found 3 monitors
vmon initialized
adc initialized
fmon_populate_monitors: found 73 monitors
fmon initialized
mc initialized
reset initialized
nvhs initialized
uphy_early initialized
emc_early initialized
392 clocks registered
clk initialized
io_dpd initialized
thermal initialized
thermal_mrq initialized
i2c initialized
vrmon_dt_init: vrmon node not found
vrmon_chk_boot_state: found 0 rail monitors
vrmon initialized
regulator initialized
avfs_clk_platform initialized
soctherm initialized
aotag initialized
powergate initialized
dvs initialized
I/TC: Non-secure external DT found
pm initialized
suspend initialized
pg_late initialized
pg_mrq_init initialized
strap initialized
nvl initialized
emc initialized
emc_mrq initialized
clk_dt initialized
tj_init initialized
/uphy is not enabled status = disabled
uphy_dt initialized
uphy_mrq initialized
uphy initialized
ec_swd_poll_start: 281 reg polling start w period 47 ms
ec_late initialized
hwwdt_late initialized
reset_mrq initialized
ec_mrq initialized
fmon_mrq initialized
clk_mrq initialized
avfs_mrq initialized
mail_mrq initialized
i2c_mrq initialized
tag_mrq initialized
console_mrq initialized
mrq initialized
clk_sync_fmon_post initialized
clk_dt_late initialized
noc_late initialized
pm_post initialized
dbells initialized
dmce initialized
cvc initialized
avfs_clk_mach_post initialized
avfs_clk_platform_post initialized
cvc_late initialized
regulator_post initialized
rm initialized
console_late initialized
clk_dt_post initialized
mc_reg initialized
pg_post initialized
profile initialized
fuse_late initialized
extras_post initialized
bpmp: init complete
entering main console loop
]
I/TC: OP-TEE version: 3.22 (gcc version 9.3.0 (Buildroot 2020.08)) #2 Wed Aug 28 08:55:09 UTC 2024 aarch64
I/TC: WARNING: This OP-TEE configuration might be insecure!
I/TC: WARNING: Please check https://optee.readthedocs.io/en/latest/architecture/porting_guidelines.html
I/TC: Primary CPU initializing
I/TC: Primary CPU switching to normal world boot
[0299.293] I> Welcome to TBoot-CPU Recovery
[0299.294] I> Heap: [0xa0f00000 ... 0xa9f00000
[0299.294] I> gpio framework initialized
[0299.311] I> tegrabl_gpio_driver_register: register 'nvidia,tegra194-gpio' driver
[0299.323] I> tegrabl_gpio_driver_register: register 'nvidia,tegra194-gpio-aon' driver
[0299.323] I> tegrabl_tca9539_init: i2c bus: 1, slave addr: 0x46
[0299.332] W> fetch_driver_phandle_from_dt: failed to get node with compatible ti,tca9539
[0299.339] W> fetch_driver_phandle_from_dt: failed to get node with compatible nxp,tca9539
[0299.340] W> tegrabl_tca9539_init: failed to fetch phandle from dt
[0299.340] I> tegrabl_tca9539_init: i2c bus: 1, slave addr: 0x44
[0299.347] W> fetch_driver_phandle_from_dt: failed to get node with compatible ti,tca9539
[0299.353] W> fetch_driver_phandle_from_dt: failed to get node with compatible nxp,tca9539
[0299.355] W> tegrabl_tca9539_init: failed to fetch phandle from dt
[0299.366] I> fixed regulator driver initialized
[0299.542] I> CPU: Nvidia Carmel
[0299.542] I> CPU: MIDR: 0x4e0f0040, MPIDR: 0x80000000
[0299.543] I> Platform: Silicon
[0299.543] I> chip revision : A02P
[0299.543] I> Boot_device: SDMMC_BOOT instance: 3
[0299.544] I> sdmmc-3 params source = boot args
[0300.932] E> OCR failed, error = f0f0a06
[0300.933] E> Failed to open sdmmc-3, err = f0f0a06
[0300.933] E> Failed to initialize boot device
[0300.933] E> Top caller module: SDMMC, error module: SDMMC, reason: 0x06, aux_info: 0x0a
[0300.934] I> TBoot-CPU Recovery hang
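
Since the interesting lines are buried in a long log, here is a small Python sketch that pulls out only the error-level (`E>`) bootloader messages with their timestamps. It assumes the `[seconds] LEVEL> message` line format seen above; `error_lines` is a hypothetical helper name, and lines without that prefix (BPMP and OP-TEE output) are simply skipped.

```python
import re
from typing import List, Tuple

# Matches bootloader lines such as "[0300.932] E> OCR failed, error = f0f0a06".
LINE_RE = re.compile(r"^\s*\[(\d+\.\d+)\]\s+([EWI])>\s*(.*)$")


def error_lines(log_text: str) -> List[Tuple[float, str]]:
    """Return (timestamp, message) pairs for every E> line in the log."""
    errors = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if m and m.group(2) == "E":
            errors.append((float(m.group(1)), m.group(3).rstrip()))
    return errors


sample = """\
[0299.544] I> sdmmc-3 params source = boot args
[0300.932] E> OCR failed, error = f0f0a06
[0300.933] E> Failed to open sdmmc-3, err = f0f0a06
[0300.934] I> TBoot-CPU Recovery hang
bpmp: init complete
"""

for ts, msg in error_lines(sample):
    print(f"{ts:8.3f}  {msg}")
```

Run against the whole capture, this should make it easier to see that the same `OCR failed` error shows up in both the MB2 applet stage and the final TBoot-CPU Recovery stage.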

Here’s the full output of the flash
agx_xavier_flash.log (726.3 KB)

Which JetPack version are you using?
Are you flashing from a VM or from a PC running Ubuntu natively?

I am using native hardware.

Handle 0x0002, DMI type 2, 15 bytes
Base Board Information
Manufacturer: Intel Corporation
Product Name: NUC5PPYB
Version: H76558-108
Serial Number: GEPY72900JXJ

guru@unb:~$ cat /etc/rel
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.6 LTS"
NAME="Ubuntu"
VERSION="20.04.6 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.6 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

guru@unb:~$ uname -na
Linux unb 5.15.0-124-generic #134~20.04.1-Ubuntu SMP Tue Oct 1 15:27:33 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

also

guru@unb:~$ sdkmanager --ver

2.2.0.12021

Are you sure the device in use here is a Jetson AGX Xavier? Is it an NVIDIA devkit or a custom board?

Have you flashed any other Jetson AGX Xavier before?

Yes, I have flashed this exact AGX Xavier. It stopped working suddenly, so I tried to reflash - here we are.

I installed Ubuntu 18.04 on the PC and tried to flash JetPack 4.6.5.

The same errors are appearing in the serial console while trying to flash -
booting_nvidia.log (13.0 KB)

I am also attaching the logs of the attempt sdkm-2024-11-04-22-37-59.log.
sdkm-2024-11-04-22-37-59.log (398.4 KB)

And a picture of the S/N and Board.

In direct reply to your question about whether I have flashed another: I do have a different device, a Jetson Xavier NX, that I have not attempted to flash recently; however, I did flash it some time ago.

( NVIDIA Jetson Xavier NX Developer Kit (812674024318), 16 GB )

Hi bryant.eadon,
To obtain additional information, please use the EXPORT LOGS command to generate a log dump, as illustrated in the image provided.

Just to clarify my question again.

  1. I just wanted to make sure you are not new to this and have flashed a Jetson before.

  2. If you used the same method to flash this Jetson AGX Xavier before and it worked, but the device is dead now, I suspect a hardware defect. Do you have another Jetson AGX Xavier to cross-check whether the carrier board is still fine?

  1. Yes, I have flashed a Jetson before, but only 1-2 times successfully, early on. I was fortunate in that things “just worked” when I followed the instructions, and I was able to use it for a few models. I have had many unsuccessful attempts recently, likely all related to this eMMC module. Or I am doing something completely wrong.

  2. Unfortunately, I do not have another AGX Xavier. However, I do have a multimeter and an oscilloscope. Are there test points, voltages, or pins where I should look for signals?

Is there a common problem with the power supplies going to the eMMC? (For example, I could check for 5 V or 3.3 V.)

The board seems almost working; I just can’t tell what’s wrong. Should I boot into some kind of internal console? In another post I read about an inner console whose prompt starts with “nv>”. (Perhaps that appears when the serial console is working properly? Or should I press some keys while it is booting to get there?)

DavidLLL :

I just get this forever –

Occasionally, if I wait long enough, I get a prompt saying that things are taking a long time and asking whether I want to cancel.

I previously waited 23 hours for this to progress before finally cancelling. Waiting does not help.

I do not have the option to export logs in this UI.

I would consider this a hardware defect. Please file an RMA for this device.

When I press “Pause for a Bit” I get “Pausing…” and “Completing current task” – which also never finishes.

Screenshot -

No option to export logs in the UI when it’s stuck like this. Which logs do you want to see? I scraped the previous ones from the ~/.nvsdkm directory.

For tonight’s attempt with 18.04 -
I’ve tar gz’d this for you and attached it here if that’s where the logs are located.
nvsdkm.tar.gz (475.1 KB)

I opened a support ticket and listed this forum message. I will respond in the official ticket about this assessment.


It finally presented me with a cancel message, and now I have an Export Logs button.

Zip file attached.
SDKM_logs_JetPack_4.6.5_Linux_for_Jetson_AGX_Xavier_2024-11-05_22-13-36.zip (175.9 KB)

Hi,
With the latest log, as Wayne said, this appears to be a hardware defect.
At this point, it would be advisable to halt further diagnostic checks and proceed with the RMA.
