Bricked Jetson Xavier AGX

Problem: My Jetson Xavier AGX had been working great until it got ‘bricked’ yesterday, when it suddenly refused to boot. Its power light turns on and stays on when booted in both normal and recovery mode, and the indicator LED on my computer monitor changes colour (indicating that it is connected to a running computer), but the monitor display remains black. Is there any way to resolve this without reflashing the Jetson?

Troubleshooting information: My Jetson is running the JetPack version of Ubuntu 20.04. I believe the power to my Jetson may have been cut before it shut down properly after work last night. I’ve attempted to connect to it using gtkterm (which has worked in the past) but only get the error message “Control signals read: Input/Output error”.

Hi,

If you don’t want to reflash the board, I would suggest learning to check the serial console first.

Most newcomers don’t know how to capture the serial log correctly, so please ask if you hit any problem with the setup above.

I will add: A serial console does not need networking, and it does not need video to work. The terminal program actually runs on another computer, so much of the Jetson can crash and burn and the serial console will still work. The program which acts as the terminal on the other computer can be told to log everything, even before you power on the Jetson; this log is quite valuable to attach to the forum because it shows log information from early boot stages (long before Linux or the GUI is ever reached).
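As a rough illustration of the "log everything" idea, here is a minimal Python sketch that copies lines from a binary console stream into a text log and prefixes each with the host's time. The function name `log_console` is made up for this example. It assumes the stream is something already opened and configured elsewhere (for instance a serial device set up by a terminal program or similar); the demo feeds it a few bytes modelled on the log in this thread instead of a real port.

```python
import io
import time


def log_console(stream, log, clock=time.time):
    """Copy lines from a binary console stream into a text log.

    Each line is prefixed with the host time so early-boot output can be
    lined up with what happens on the host. Undecodable bytes (line noise)
    are replaced rather than dropped, so nothing silently disappears.
    Returns the number of lines written.
    """
    count = 0
    # readline() returns b"" at end of stream, which stops the loop.
    for raw in iter(stream.readline, b""):
        text = raw.decode("utf-8", errors="replace").rstrip("\r\n")
        log.write(f"{clock():.3f} {text}\n")
        log.flush()  # keep the log usable even if the session is killed
        count += 1
    return count


# Demo with an in-memory stream standing in for the serial port (assumption:
# a real session would pass an opened, already-configured device instead).
fake_port = io.BytesIO(b"[0000.325] I> MB1 done\r\n\xff\xfemain enter\n")
captured = io.StringIO()
log_console(fake_port, captured)
print(captured.getvalue(), end="")
```

Starting the logger before powering on the Jetson is the important part; the timestamps are only a convenience.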

AGX Xavier in particular is quite convenient because it has a micro-OTG USB slot (micro-USB with an ID detect that knows if it is a type-A micro-USB or type-B micro-USB) with a UART built in, so you can view boot log information over it. The host PC has to have the correct chipset driver, but most Linux PCs have this by default (especially Ubuntu).

Incidentally, to be bricked means that it cannot be recovered and is dead until BIOS firmware memory is actually removed and flashed on dedicated hardware, then soldered back on. Jetsons don’t have a BIOS, and cannot be bricked like that. Instead, when in recovery mode, a Jetson is a custom USB device understood by the custom USB driver (appropriately named the “driver package”, and installed and run by the GUI front end to the flash software, JetPack/SDK Manager). About the only way to brick a Jetson is from actual hardware destruction (there are some i2c commands which can cause problems, but you have to try hard to do that). So it isn’t bricked, and even if you did have to flash, you could first clone the existing rootfs partition and have access to it (or use that for reflashing if it is non-rootfs content which is failing).

Thank you both for your replies! I couldn’t receive any serial console logs via gtkterm, but I will try minicom in the lab tomorrow and post an update on the logs I receive.

If the serial console is not working, then on the host PC, monitor “dmesg --follow”. As you plug in the Jetson (with its power on, or off, but then as power goes on) you should see new log lines on the host PC. Be sure to mention what plugging in the powered Jetson produces in the host PC logs if serial console is not working.
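If it helps when posting results, the host-side lines can be filtered down to the USB-related ones. The following is only a sketch: `usb_events` is a hypothetical helper, it assumes the default `[seconds.micros] message` layout of dmesg output, and the sample input mixes two USB lines in the shape reported later in this thread with one made-up unrelated line.

```python
import re

# Assumes the default dmesg layout: "[ 1234.567890] message text"
DMESG_LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def usb_events(lines):
    """Return (timestamp, message) pairs for dmesg lines that mention usb."""
    events = []
    for line in lines:
        match = DMESG_LINE.match(line.strip())
        if match and re.search(r"\busb\b", match.group(2), re.IGNORECASE):
            events.append((float(match.group(1)), match.group(2)))
    return events


sample = [
    "[90034.979157] usb usb2-port9: config error",
    "[90035.000000] wlan0: link up",  # made-up unrelated line
    "[90039.074525] usb usb2-port9: Cannot enable. Maybe the USB cable is bad?",
]
for stamp, message in usb_events(sample):
    print(stamp, message)
```

An empty result after plugging in the powered Jetson is itself worth reporting, since it means the host saw no USB activity at all.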

I’ve confirmed that the Jetson is recognized via lsusb and as ports ttyUSB0 - 3 when plugged in through the micro-OTG USB slot, but when powering off and back on while running ‘dmesg --follow’, there is no change in the dmesg log.

I’ve installed and configured minicom according to the instructions provided in the RidgeRun wiki. Upon powering up, the following log is printed, and then the serial console goes blank (sorry, it’s a long output):

[0000.084] W> RATCHET: MB1 binary ratchet value 4 is larger than ratchet level 2 from HW fuses.
[0000.092] I> MB1 (prd-version: 2.3.0.0-t194-41334769-0a17edc1)
[0000.097] I> Boot-mode: Coldboot
[0000.100] I> Platform: Silicon
[0000.103] I> Chip revision : A02P
[0000.106] I> Bootrom patch version : 15 (correctly patched)
[0000.111] I> ATE fuse revision : 0x200
[0000.115] I> Ram repair fuse : 0x0
[0000.118] I> Ram Code : 0x2
[0000.120] I> rst_source: 0x0, rst_level: 0x0
[0000.125] I> Boot-device: SDMMC (instance: 3)
[0000.141] I> sdmmc DDR50 mode
[0000.145] I> Boot chain mechanism: A/B
[0000.149] I> Current Boot-Chain Slot: 0
[0000.152] I> BR-BCT Boot-Chain: 0, status: 0. update flag: 0
[0000.159] W> PROD_CONFIG: device prod data is empty in MB1 BCT.
[0000.166] I> Temperature = 29000
[0000.169] W> Skipping boost for clk: BPMP_CPU_NIC
[0000.173] W> Skipping boost for clk: BPMP_APB
[0000.177] W> Skipping boost for clk: AXI_CBB
[0000.181] W> Skipping boost for clk: AON_CPU_NIC
[0000.186] W> Skipping boost for clk: CAN1
[0000.189] W> Skipping boost for clk: CAN2
[0000.194] I> Boot-device: SDMMC (instance: 3)
[0000.204] I> Sdmmc: HS400 mode enabled
[0000.209] I> Non-ECC region[0]: Start:0x80000000, End:0x100000000
[0000.216] W> Thermal config not found in BCT
[0000.224] W> MEMIO rail config not found in BCT
[0000.246] I> sdmmc bdev is already initialized
[0000.291] W> Platform config not found in BCT
[0000.325] I> MB1 done

main enter
SPE VERSION #: R01.00.18 Created: Jan 29 2021 @ 14:18:27
HW Function test
Start Scheduler.
in late init
[0000.334] I> Welcome to MB2(TBoot-BPMP) (version: default.t194-mobile-1ca012e4)
[0000.335] I> DMA Heap @ [0x526fa000 - 0x52ffa000]
[0000.335] I> Default Heap @ [0xd486400 - 0xd48a400]
[0000.336] E> DEVICE_PROD: Invalid value data = 70020000, size = 0.
[0000.342] W> device prod register failed
[0000.345] I> gpio framework initialized
[0000.349] I> tegrabl_gpio_driver_register: register ‘nvidia,tegra194-gpio’ driver
[0000.356] I> tegrabl_gpio_driver_register: register ‘nvidia,tegra194-gpio-aon’ driver
[0000.364] I> No valid sdcard_params in mb1_bct
[0000.368] I> Boot_device: SDMMC_BOOT instance: 3
[0000.373] I> sdmmc-3 params source = boot args
[0000.382] I> sdmmc-3 params source = boot args
[0000.382] I> sdmmc bdev is already initialized
[0000.415] I> Found 20 partitions in SDMMC_BOOT (instance 3)
[0000.431] I> Found 41 partitions in SDMMC_USER (instance 3)
[0000.432] I> Active Boot chain : 0
[0000.490] I> Relocating BR-BCT
[0000.492] > DEVICE_PROD: device prod is not initialized.
[0000.517] E> I2C: slave not found in slaves.
[0000.518] E> I2C: Could not write 0 bytes to slave: 0x00ae with repeat start true.
[0000.519] E> I2C_DEV: Failed to send register address 0x00000000.
[0000.520] E> I2C_DEV: Could not read 256 registers of size 1 from slave 0xae at 0x00000000 via instance 0.
[0000.521] E> eeprom: Failed to read I2C slave device
[0000.524] I> Failed to read CVB eeprom data @ AE
[0000.528] I> Retrying CVB eeprom read @ AC …
[0000.602] I> Relocating OP-TEE dtb from: 0x6bfff1d0 to 0x70050000, size: 1008
[0000.603] I> [0] START: 0x80000000, SIZE: 0x2f000000
[0000.603] I> [1] START: 0xaf010000, SIZE: 0x18bf0000
[0000.604] I> [2] START: 0xc7d00000, SIZE: 0xc0000
[0000.604] I> [3] START: 0xca000000, SIZE: 0x800000
[0000.605] I> dram_block larger than 80000000
[0000.607] I> [4] START: 0x100000000, SIZE: 0x780000000
[0000.619] I> Setting NS memory ranges to OP-TEE dtb finished.
[0000.633] I> found decompressor handler: lz4
[0001.129] I> EKB detected (length: 0x410) @ VA:0x526ff400
[0001.131] I> Setting EKB blob info to OPTEE dtb finished.
NOTICE: BL31: v2.6(release):4fa405dbd
NOTICE: BL31: Built : 20:16:55, Aug 10 2022
I/TC:
I/TC: Non-secure external DT found
bpmp: init
bpmp: tag is 128431eec76692047e1ac1ebc0392266
sku_dt_init: not sku 0x00
clk_early initialized
mail_early initialized
fuse initialized
hwwdt initialized
t194_ec_get_ec_list: found 45 ecs
ec initialized
vmon_setup_monitors: found 3 monitors
vmon initialized
adc initialized
fmon_populate_monitors: found 73 monitors
fmon initialized
mc initialized
reset initialized
nvhs initialized
uphy_early initialized
emc_early initialized
392 clocks registered
clk initialized
io_dpd initialized
thermal initialized
thermal_mrq initialized
i2c initialized
vrmon_dt_init: vrmon node not found
vrmon_chk_boot_state: found 0 rail monitors
vrmon initialized
regulator initialized
I/TC: OP-TEE version: 3.16 (gcc version 9.3.0 (Buildroot 2020.08)) #2 Thu Aug 11 03:23:20 UTC 2022 aarch64
avfs_clk_platform initialized
soctherm initialized
aotag initialized
powergate initialized
I/TC: WARNING: This OP-TEE configuration might be insecure!
I/TC: WARNING: Please check https://optee.readthedocs.io/en/latest/architecture/porting_guidelines.html
dvs initialized
pm initialized
suspend initialized
pg_late initialized
pg_mrq_init initialized
strap initialized
nvl initialized
emc initialized
emc_mrq initialized
I/TC: Primary CPU initializing
clk_dt initialized
tj_init initialized
uphy_dt initialized
uphy_mrq initialized
uphy initialized
ec_swd_poll_start: 281 reg polling start w period 47 ms
ec_late initialized
hwwdt_late initialized
reset_mrq initialized
ec_mrq initialized
fmon_mrq initialized
clk_mrq initialized
avfs_mrq initialized
mail_mrq initialized
i2c_mrq initialized
tag_mrq initialized
console_mrq initialized
mrq initialized
clk_sync_fmon_post initialized
clk_dt_late initialized
noc_late initialized
pm_post initialized
dbells initialized
dmce initialized
cvc initialized
avfs_clk_mach_post initialized
avfs_clk_platform_post initialized
cvc_late initialized
regulator_post initialized
rm initialized
console_late initialized
clk_dt_post initialized
mc_reg initialized
pg_post initialized
profile initialized
fuse_late initialized
extras_post initialized
bpmp: init complete
entering main console loop
] I/TC: Primary CPU switching to normal world boot
[0001.782] I> Welcome to NVDisp-Init
[0001.782] I> NVDisp-Init version: t194-f9ecfedc
[0001.782] I> CPU-BL Params @ 0xca020000
[0001.783] I> 0) Base:0x00000000 Size:0x00000000
[0001.783] I> 1) Base:0xc8300000 Size:0x00100000
[0001.783] I> 2) Base:0xc9800000 Size:0x00200000
[0001.784] I> 3) Base:0xc8600000 Size:0x00200000
[0001.786] I> 4) Base:0xc8200000 Size:0x00100000
[0001.791] I> 5) Base:0xc8100000 Size:0x00100000
[0001.795] I> 6) Base:0xc9400000 Size:0x00400000
[0001.799] I> 7) Base:0xc9000000 Size:0x00400000
[0001.804] I> 8) Base:0xc8000000 Size:0x00100000
[0001.808] I> 9) Base:0xc7f00000 Size:0x00100000
[0001.813] I> 10) Base:0xca800000 Size:0x00800000
[0001.817] I> 11) Base:0x40000000 Size:0x00040000
[0001.822] I> 12) Base:0xc7e00000 Size:0x00100000
[0001.826] I> 13) Base:0x40046000 Size:0x00002000
[0001.831] I> 14) Base:0x40048000 Size:0x00002000
[0001.835] I> 15) Base:0xaf000000 Size:0x00004000
[0001.840] I> 16) Base:0x4004a000 Size:0x00002000
[0001.844] I> 17) Base:0xc7c00000 Size:0x00100000
[0001.849] I> 18) Base:0x4004c000 Size:0x00002000
[0001.853] I> 19) Base:0xc9a00000 Size:0x00600000
[0001.857] I> 20) Base:0x4004e000 Size:0x00002000
[0001.862] I> 21) Base:0xc7dc0000 Size:0x0000c000
[0001.866] I> 22) Base:0x00000000 Size:0x00000000
[0001.871] I> 23) Base:0xc7de0000 Size:0x00020000
[0001.875] I> 24) Base:0xcc000000 Size:0x02000000
[0001.880] I> 25) Base:0x40050000 Size:0x00002000
[0001.884] I> 26) Base:0x40040000 Size:0x00006000
[0001.889] I> 27) Base:0xc8c00000 Size:0x00400000
[0001.893] I> 28) Base:0xc8400000 Size:0x00200000
[0001.898] I> 29) Base:0xc8800000 Size:0x00400000
[0001.902] I> 30) Base:0xc7dd0000 Size:0x00010000
[0001.907] I> 31) Base:0x00000000 Size:0x00000000
[0001.911] I> 32) Base:0xf8000000 Size:0x08000000
[0001.915] I> 33) Base:0xce000000 Size:0x2a000000
[0001.920] I> 34) Base:0xcb000000 Size:0x01000000
[0001.924] I> 35) Base:0xae000000 Size:0x01000000
[0001.929] I> 36) Base:0xa0000000 Size:0x0e000000
[0001.933] I> 37) Base:0xca000000 Size:0x00800000
[0001.938] I> 38) Base:0x80000000 Size:0x20000000
[0001.942] I> 39) Base:0xb0000000 Size:0x08000000
[0001.947] I> 40) Base:0x00000000 Size:0x00000000
[0001.951] I> 41) Base:0x00000000 Size:0x00000000
[0001.956] I> 42) Base:0x00000000 Size:0x00000000
[0001.960] I> 43) Base:0x00000000 Size:0x00000000
[0001.965] I> 44) Base:0x00000000 Size:0x00000000
[0001.969] I> 45) Base:0x00000000 Size:0x00000000
[0001.973] GIC-SPI Target CPU: 0
[0001.976] Interrupts Init done
[0001.979] calling constructors
[0001.982] initializing heap
[0001.985] I> Heap: [0xa0960000 … 0xadf00000]
[0001.989] initializing threads
[0001.992] initializing timers
[0001.995] creating bootstrap completion thread
[0001.999] top of bootstrap2()
[0002.002] CPU: MIDR: 0x4E0F0040, MPIDR: 0x80000000
[0002.007] initializing platform
[0002.010] E> DEVICE_PROD: Invalid value data = 0, size = 0.
[0002.015] W> device prod register failed
[0002.019] I> Bl_dtb @0xadf00000
[0002.022] I> gpio framework initialized
[0002.034] I> tegrabl_gpio_driver_register: register ‘nvidia,tegra194-gpio’ driver
[0002.041] I> tegrabl_gpio_driver_register: register ‘nvidia,tegra194-gpio-aon’ driver
[0002.048] I> fixed regulator driver initialized
[0002.064] I> register ‘maxim’ power off handle
[0002.066] I> virtual i2c enabled
[0002.067] I> registered ‘maxim,max20024’ pmic
[0002.067] I> tegrabl_gpio_driver_register: register ‘max20024-gpio’ driver
[0002.068] I> Boot-device: eMMC
[0002.068] I> Boot_device: SDMMC_BOOT instance: 3
[0002.072] I> sdmmc-3 params source = boot args
[0002.075] W> No board IDs available
[0002.078] E> Failed to get board id info!
[0002.082] I> sdmmc bdev is already initialized
[0002.086] I> sdmmc-3 params source = boot args
[0002.093] I> Found 20 partitions in SDMMC_BOOT (instance 3)
[0002.098] I> Found 41 partitions in SDMMC_USER (instance 3)
[0002.123] I> enabling ‘vdd-hdmi-5v0’ regulator
[0002.130] I> regulator ‘vdd-hdmi-5v0’ already enabled
[0002.131] I> hdmi cable connected
[0002.136] W> set volts not configured for ‘vdd-1v0’
[0002.141] W> set volts not configured for ‘vdd-1v8-hs’
[0002.141] I> retrieved tmds range from prod_list_hdmi_soc
[0002.148] E> invalid display type
[0002.153] E> invalid display type
[0002.155] E> cannot find any other nvdisp nodes
[0002.171] I> edid read success
[0002.184] I> edid read success
[0002.184] I> width = 640, height = 480, frequency = 25174825
[0002.184] I> width = 640, height = 480, frequency = 25174825
[0002.185] I> width = 640, height = 480, frequency = 25174825
[0002.185] I> width = 640, height = 480, frequency = 25174825
[0002.186] I> width = 1920, height = 1080, frequency = 148500000
[0002.189] I> width = 1920, height = 1080, frequency = 148500000
[0002.195] I> width = 1920, height = 1080, frequency = 148351648
[0002.201] I> width = 1280, height = 720, frequency = 74175824
[0002.207] I> width = 720, height = 480, frequency = 26973026
[0002.212] I> width = 720, height = 480, frequency = 26973026
[0002.218] I> width = 1920, height = 1080, frequency = 148351648
[0002.223] I> width = 1280, height = 720, frequency = 74175824
[0002.229] I> width = 720, height = 576, frequency = 26973026
[0002.234] I> width = 720, height = 576, frequency = 26973026
[0002.240] I> width = 640, height = 480, frequency = 25174825
[0002.246] I> Best mode Width = 1920, Height = 1080, freq = 148351648
[0002.257] I> hdmi_enable, starting HDMI initialisation
[0002.262] I> hdmi_enable, HDMI initialisation complete
[0002.270] initializing target
[0002.271] calling apps_init()
[0002.272] starting app kernel_boot_app
[0002.273] I> Kernel type = Normal

Jetson UEFI firmware (version 1.0-d7fb19b built on 2022-08-10T20:18:13-07:00)

It looks like it just got stuck in UEFI. I would suggest reflashing the board directly.
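For anyone reading a long capture like the one above, a quick way to see how far boot got is to look for the banner each stage prints. This is a minimal sketch, not an official tool: `last_stage` is a hypothetical helper, the marker strings are copied from the log in this thread, and the final "Linux version" marker is an assumption about what a booting kernel would print (this particular log never reaches it).

```python
# Stage markers in boot order. All but the last are taken from the log
# posted in this thread; "Linux version" is an assumed kernel banner.
STAGES = [
    ("MB1", "MB1 (prd-version"),
    ("MB2", "Welcome to MB2"),
    ("BL31", "BL31:"),
    ("BPMP", "bpmp: init"),
    ("NVDisp-Init", "Welcome to NVDisp-Init"),
    ("UEFI", "Jetson UEFI firmware"),
    ("Linux kernel", "Linux version"),
]


def last_stage(log_text):
    """Return the name of the latest listed stage whose marker appears."""
    reached = None
    for name, marker in STAGES:
        if marker in log_text:
            reached = name
    return reached


excerpt = (
    "[0000.092] I> MB1 (prd-version: 2.3.0.0-t194-41334769-0a17edc1)\n"
    "[0000.334] I> Welcome to MB2(TBoot-BPMP) (version: default.t194-mobile-1ca012e4)\n"
    "[0001.782] I> Welcome to NVDisp-Init\n"
    "Jetson UEFI firmware (version 1.0-d7fb19b built on 2022-08-10T20:18:13-07:00)\n"
)
print(last_stage(excerpt))
```

A result of "UEFI" with nothing after it matches the reading above: the early boot stages ran, and the output stops at the UEFI banner.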

Okay, that makes sense. I’ve since attempted to re-flash the board via SDKManager; however, the Jetson is not detected by SDKManager or ‘lsusb’ when connected to the host computer (running Ubuntu 20.04) via USB-C cable.

When I run ‘dmesg --follow’ on the host computer while connecting the Jetson, it prints:

[90034.979157] usb usb2-port9: config error
[90039.074525] usb usb2-port9: Cannot enable. Maybe the USB cable is bad?

even though I know that this USB cable works. Could this be a hardware failure or is there another reason it won’t start, connect, or allow itself to be re-flashed?

I should add that on my first attempt to re-flash the board, SDKManager did not initially detect my board but successfully detected it on Step 3 of the flash process. I chose to flash to NVMe with the understanding that it would then load the OS on my Jetson’s SSD. That flash process failed, and since then SDKManager won’t even connect with my board.

Did you remember to put the board into recovery mode by pressing the button?

Or do you not know what I am talking about here?

Yes, I’ve put it in recovery mode when attempting to re-flash (recovery button + power)

This morning I’ve been testing it on a handful of other computers with five other USB-C cables, connecting them to each of the Jetson’s ports, with no luck.

The robot platform I’m running the Jetson on automatically shuts down when the battery reaches a certain level, and I wonder if my Jetson just experienced one of those sudden shutdowns as it was transferring critical data. Would an untimely shutdown be enough to brick it?

Shutting down won’t brick a Jetson. Jetsons don’t have a BIOS, so they are immune to most issues which might brick a PC. I will add that Jetsons are quite sensitive to power regulation. If the power quality is poor, they can shut down, and this might lead to filesystem corruption.

Incidentally, only the designated port works for recovery mode. On the host PC you might run “dmesg --follow” before plugging in the recovery mode Jetson. Hold down the recovery button, and then add power to the Jetson, followed by letting go of the recovery button. Then watch the host PC’s “dmesg --follow”. Plug in the Jetson, and see if anything logs. That log would show at least detection of the USB even if there is some error; if nothing at all appears, then it might be a USB issue or actual hardware failure.

Thanks @linuxdev. Unfortunately, when I boot in recovery mode, the only messages shown in the dmesg --follow log are:

usb usb2-port9: config error
usb usb2-port9: Cannot enable. Maybe the USB cable is bad?

And minicom simply provides the log I pasted above when booted in recovery mode. I’ve looked into that ‘dmesg --follow’ log error elsewhere on this forum and see that others have had the same issue. However, most of them can still boot their Jetson and can thus change config files, whereas my Jetson seemingly gets stuck booting the firmware and thus won’t allow me to change anything on the software side.

Is the config error from the host PC’s “dmesg --follow”? I ask because normally there is a timestamp on the left. Keep in mind that I am asking about plugin of the cable used for flashing; if that is from serial console plugin, then it is a different issue. Either way, this could be from a USB protocol error, which in turn could be from signal quality or hardware or software issues. In the case of a recovery mode Jetson it would be rather difficult to change recovery mode software/firmware content (it could be done with great effort and a lot of knowledge, but very few people could achieve that). That leaves signal quality or hardware as the most likely answer. Incidentally, signal quality issues are often impossible to differentiate from hardware issues. I think you’d need a proper USB protocol analyzer if this is the only method of determining failure. It would be quite useful to see the logs occurring on other Linux PCs as you plug in the Jetson (with the Jetson already in recovery mode). If it is the same message (and not just failure) with different hosts and cables, then it leans towards hardware failure (even so, you’d still need a USB protocol analyzer to confirm).
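The "same message on different hosts" comparison can be done by eye, but a small script keeps it honest when several logs are involved. This is only a triage sketch under assumptions: `shared_usb_errors` is a hypothetical helper, it assumes lines of the form `usb <port>: <message>` with an optional dmesg timestamp, and the second host's log in the demo is invented for illustration. A shared message would lean towards the Jetson side, as described above, but would not confirm a hardware fault.

```python
import re

# Matches "usb <port>: <message>", with or without a leading dmesg timestamp.
# The port name is dropped because it differs from host to host.
USB_MSG = re.compile(r"^(?:\[\s*\d+\.\d+\]\s*)?usb\s+\S+:\s*(.+)$")


def shared_usb_errors(logs_by_host):
    """Return the USB messages that appear in every host's log."""
    per_host = []
    for lines in logs_by_host.values():
        messages = set()
        for line in lines:
            match = USB_MSG.match(line.strip())
            if match:
                messages.add(match.group(1))
        per_host.append(messages)
    return set.intersection(*per_host) if per_host else set()


logs = {
    "host_a": [
        "[90034.979157] usb usb2-port9: config error",
        "[90039.074525] usb usb2-port9: Cannot enable. Maybe the USB cable is bad?",
    ],
    # Invented second host, for illustration only.
    "host_b": ["usb usb1-port3: config error"],
}
print(shared_usb_errors(logs))
```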

Yes, it is the config error from the host PC’s dmesg --follow. I removed the timestamps for readability’s sake in this forum, but both lines were printed with timestamps.

I flashed with the USB-C port on the same side of the Xavier as the serial console’s micro USB port. The USB-C port, not the serial console port, is the port that dmesg --follow says has the config error.

It is best to not remove anything, but in this case it won’t matter.

Let’s do a comparison. If you plug in that cable with recovery mode and get that log, does it do the same thing if you plug in that cable (same connector) when the Jetson is already started, but not in recovery mode? If it is the same in both modes, then it might be a PHY error. If this changes the error, then it is more likely a software error.

Great news: this morning my colleague plugged it into his computer, and for some reason SDKManager on his computer detected the Jetson, so we successfully flashed it to the eMMC and it is now as good as new! Thank you @linuxdev and @WayneWWW for your assistance.
