Xavier NX shuts down and is stuck on boot, temperature was very high

We were using our Jetson as the main PC for an ROV, inside a closed enclosure with other electrical parts. The temperature inside the enclosure was around 40 degrees Celsius. We did not see the Jetson’s temperature, but when we touched it after it shut down, the Jetson was very hot.

Below is the serial debug output read from the Jetson's UART.

[0000.025] W> RATCHET: MB1 binary ratchet value 4 is larger than ratchet level 2 from HW fuses.
[0000.033] I> MB1 (prd-version: 2.3.0.0-t194-41334769-0a17edc1)
[0000.039] I> Boot-mode: Coldboot
[0000.042] I> Platform: Silicon
[0000.044] I> Chip revision : A02P
[0000.047] I> Bootrom patch version : 15 (correctly patched)
[0000.053] I> ATE fuse revision : 0x200
[0000.056] I> Ram repair fuse : 0x0
[0000.059] I> Ram Code : 0x0
[0000.062] I> rst_source: 0x0, rst_level: 0x0
[0000.067] I> Boot-device: QSPI (instance: 0)
[0000.071] I> Qspi flash params source = brbct
[0000.075] I> Qspi clock source : pllp
[0000.078] I> Qspi-0 initialized successfully
[0000.082] I> Boot chain mechanism: A/B
[0000.085] I> Current Boot-Chain Slot: 0
[0000.089] I> BR-BCT Boot-Chain: 0, status: 0. update flag: 0
[0000.094] I> Qspi flash params source = brbct
[0000.101] W> PROD_CONFIG: device prod data is empty in MB1 BCT.
[0000.108] I> Temperature = 34500
[0000.111] W> Skipping boost for clk: BPMP_CPU_NIC
[0000.115] W> Skipping boost for clk: BPMP_APB
[0000.119] W> Skipping boost for clk: AXI_CBB
[0000.123] W> Skipping boost for clk: AON_CPU_NIC
[0000.127] W> Skipping boost for clk: CAN1
[0000.131] W> Skipping boost for clk: CAN2
[0000.135] I> Boot-device: QSPI (instance: 0)
[0000.139] I> Qspi flash params source = mb1bct
[0000.144] I> Qspi clock source : pllc_out0
[0000.147] I> Qspi-0 reinitialized
[0000.150] I> Qspi flash params source = mb1bct
[0000.164] I> Non-ECC region[0]: Start:0x80000000, End:0x100000000
[0000.170] W>  Thermal config not found in BCT
[0000.178] W>  MEMIO rail config not found in BCT
[0000.188] C> Task 0x4d failed (err: 0x8c8c011c)
[0000.193] E> Top caller module: MB1_MSS, error module: MB1_MSS, reason: 0x1c, aux_info: 0x01
[0000.200] C> Error: 0x8c8c011c
[0000.203] C> MB1(2.3.0.0-t194-41334769-0a17edc1) BIT boot status dump :
0000000000111111111001000000000001111011111101110111111111111111111111001111100000000000000000000000000000000000000000000000000010100000000001010000000000000101000000000010100000000000000000000000000000000000000000000000000000000000000000000000000000000000
[0000.233] I> Reset to recovery mode

Nothing shows up after the last line. We are not sure whether the problem is the Jetson itself or something else, such as its storage. [Jetson Xavier NX running Ubuntu 20.04 on an SSD]

Hi althaf000,

Are you using the devkit or a custom carrier board for the Xavier NX?
Which JetPack version are you using?

The current temperature seems normal.
Do you have the serial console log from when it shut down, so that we can check what caused the shutdown?

What power source are you using, and at what voltage?
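For reference, the `Temperature = 34500` line in the MB1 log reads as about 34.5 °C if the value is in millidegrees Celsius. That unit is an assumption here (it is the convention Linux thermal zones use), not something the log states. A minimal sketch for pulling the value out of a captured log; the function name is hypothetical:

```python
import re

# Matches MB1 lines such as "[0000.108] I> Temperature = 34500".
TEMP_RE = re.compile(r"\[(\d+\.\d+)\]\s+I>\s+Temperature\s*=\s*(-?\d+)")

def mb1_temperatures(log_text):
    """Return (timestamp_s, temp_c) pairs.

    ASSUMPTION: the raw value is in millidegrees Celsius.
    """
    return [(float(ts), int(raw) / 1000.0) for ts, raw in TEMP_RE.findall(log_text)]

sample = (
    "[0000.101] W> PROD_CONFIG: device prod data is empty in MB1 BCT.\n"
    "[0000.108] I> Temperature = 34500\n"
)
print(mb1_temperatures(sample))  # [(0.108, 34.5)]
```

Note that this is the temperature at the failed boot attempt, not at the time of the original shutdown.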

If boot gets stuck in the early MB1 bootloader stage, the fan will not run at that stage.

Leaving the board waiting in this stage will only make its temperature rise.

If it never boots, try reflashing the board first.

Also, please answer KevinFFF’s questions above.

Yes, we were thinking about reflashing the board if we can’t find a better solution. Thankfully we can still access the data by putting the SSD in an enclosure and backing it up. Are there any resources I can read to learn more about the error codes the bootloader is outputting? That might be helpful in the future. Thank you so much for the input.
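On the error codes: no public table of them is referenced in this thread, but the log itself hints at the layout. In `0x8c8c011c`, the low byte matches `reason: 0x1c` and the next byte matches `aux_info: 0x01`, while both the top caller module and the error module are printed as MB1_MSS, which would fit the two upper bytes (`0x8c`, `0x8c`). A rough sketch under that assumption; the field split is inferred from this one log line, not from NVIDIA documentation, and the function and key names are made up for illustration:

```python
def split_mb1_error(code):
    """Split a 32-bit MB1 error word into four byte-sized fields.

    ASSUMED layout (inferred from the log in this thread, not documented):
    [31:24] top caller module, [23:16] error module,
    [15:8] aux_info, [7:0] reason.
    """
    return {
        "top_module": (code >> 24) & 0xFF,
        "error_module": (code >> 16) & 0xFF,
        "aux_info": (code >> 8) & 0xFF,
        "reason": code & 0xFF,
    }

fields = split_mb1_error(0x8C8C011C)
print({name: hex(value) for name, value in fields.items()})
# {'top_module': '0x8c', 'error_module': '0x8c', 'aux_info': '0x1', 'reason': '0x1c'}
```

This only separates the fields the bootloader already prints. What reason `0x1c` means inside MB1_MSS is not something the log explains.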

If I remember correctly, it’s a reComputer Xavier NX from Seeed Studio. As for the JetPack version, I can’t recall the specific version, but I remember it was one of the first that supported Ubuntu 20.04, so it must be one of the early 5.x releases. I’m sorry, but I don’t have the log from when it shut down. To be clear, the high temperature I mentioned was only at the time of the incident. The board is now just stuck at boot, at idle temperature, without the fan spinning at all. For power, we were using a 4S battery stepped up to 19V with a boost converter (through a DC jack).

Also, it might be important to mention that at the time we were trying out the 20W mode instead of the normal 10W mode (which we had used for the longest time). While using and testing it outside the closed enclosure, it worked fine. We thought that if it got too hot inside, it would just throttle hard, although we never really saw that happen during testing outside the enclosure. But after using it in trials (inside the enclosure) for a few minutes, it shut down, as I said at the beginning. So we assumed that the harsh temperatures inside the enclosure were what made it stop working.
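For future trials inside the enclosure, one way to see whether the module is heating up toward a thermal shutdown is to log the kernel's thermal zones while it runs. A minimal sketch, assuming the standard Linux sysfs layout (`/sys/class/thermal/thermal_zone*/type` and `temp`, with `temp` in millidegrees Celsius); the 80 °C threshold and the function names are hypothetical, not NVIDIA limits:

```python
import glob
import os

def read_thermal_zones(base="/sys/class/thermal"):
    """Return {zone_type: temp_c} for every readable thermal zone under `base`."""
    readings = {}
    for zone in sorted(glob.glob(os.path.join(base, "thermal_zone*"))):
        try:
            with open(os.path.join(zone, "type")) as f:
                name = f.read().strip()
            with open(os.path.join(zone, "temp")) as f:
                # The kernel reports millidegrees Celsius.
                readings[name] = int(f.read().strip()) / 1000.0
        except (OSError, ValueError):
            continue  # skip zones that cannot be read or parsed
    return readings

def too_hot(readings, limit_c=80.0):
    """Names of zones at or above `limit_c` (a hypothetical threshold)."""
    return sorted(name for name, temp in readings.items() if temp >= limit_c)

zones = read_thermal_zones()
print(zones, "HOT:", too_hot(zones))
```

Calling this every few seconds and writing the result to the SSD would leave a temperature trace that survives an unexpected shutdown.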

To add, we also have an NVIDIA Jetson Nano dev board in our inventory. We tried moving the Jetson Xavier NX module onto the Jetson Nano dev board, but the result was the same: it did not boot. We have also determined that the custom Jetson NX carrier board is not the issue, since the Jetson Nano module boots successfully when inserted into the custom NX board.

We would need the serial console log from when it shut down to know whether it was caused by high temperature.

Do you have another 19V power supply you could try, to see if it helps?

If it still cannot boot successfully, we suggest simply reflashing the board.
