Jetson Orin Nano recovery boot problem

Problem Background:
We have installed the Jetson Orin Nano core module on a custom baseboard, with a single 512 GB solid-state drive as the only boot device for the core module.

Problem Description: One of our core modules shows the following issue: after several reboots, it enters recovery boot mode and no longer works properly. We can temporarily resolve this by modifying the OS chain A status variable in the BIOS, but the problem recurs after a few more reboots. So far, this issue has only been observed on one Jetson Orin Nano.

Log Description: The following is a set of serial print logs from the Jetson Orin Nano core module. This set of logs records three reboot processes (you can distinguish the boundaries of these three log segments by searching for “Rebooting system”). In the first two reboots, the core module can boot normally in direct boot mode, but in the third reboot, the core module enters recovery boot mode.
recovery_boot_log.txt (231.8 KB)
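
For anyone working through the attached log, here is a minimal sketch for splitting it into one file per boot cycle at the "Rebooting system" marker mentioned above. The function name split_boots and the boot_N.log output names are only illustrative; they are not part of any NVIDIA tooling.

```shell
#!/usr/bin/env bash
# split_boots LOG [OUTDIR]
# Hypothetical helper: writes boot_1.log, boot_2.log, ... into OUTDIR.
# A new file starts after every line that contains "Rebooting system".
split_boots() {
    local log="$1" outdir="${2:-.}"
    awk -v dir="$outdir" '
        BEGIN { n = 1 }
        { print > (dir "/boot_" n ".log") }           # append line to the current segment
        /Rebooting system/ { close(dir "/boot_" n ".log"); n++ }
    ' "$log"
}
```

Running split_boots recovery_boot_log.txt /tmp/boots should leave the two normal boots and the recovery boot in separate files, which makes them easier to diff.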

Hi 943733826,

Which JetPack version is in use? JP6.0 (r36.3.0)?

Are you running a reboot test?
If so, have you confirmed that nv-l4t-bootloader-config.service has been loaded before you execute the reboot?
You can refer to the following similar topics for details.
Continuous reboots will fail to boot into OS - Jetson & Embedded Systems / Jetson AGX Orin - NVIDIA Developer Forums
Reset Timing of Boot Retry Count - #7 by KevinFFF
Orin 35.4.1 enters recovery mode after deploying software and rebooting an indeterminate number of times. Do you have any suggestions? - Jetson & Embedded Systems / Jetson AGX Orin - NVIDIA Developer Forums
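
A minimal sketch of such a check before rebooting, assuming that "loaded" here means the service's script has already run to completion with exit status 0 in the current boot. The helper names unit_has_finished and wait_for_unit are hypothetical; only the systemctl show properties they query come from systemd.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the unit's main process has already
# exited with status 0 since boot (assumption: this is what "loaded" means).
unit_has_finished() {
    local unit="$1" exited status
    exited=$(systemctl show -p ExecMainExitTimestampMonotonic --value "$unit")
    status=$(systemctl show -p ExecMainStatus --value "$unit")
    [ -n "$exited" ] && [ "$exited" != "0" ] && [ "$status" = "0" ]
}

# Hypothetical helper: poll once per second until the unit has finished,
# giving up with a non-zero return code after TIMEOUT seconds (default 60).
wait_for_unit() {
    local unit="$1" timeout="${2:-60}" waited=0
    until unit_has_finished "$unit"; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
}
```

In a reboot test this could be used as wait_for_unit nv-l4t-bootloader-config.service 60 && sudo reboot, so that the reboot is only issued once the service has done its work.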

Thank you for the guidance. Based on the information provided, we have determined the following:
1. The JetPack version is JP6.1.
2. The nv-l4t-bootloader-config.service file can be found in the /opt/nvidia/l4t-bootloader-config directory, but the service remains in an inactive state by default.
3. We attempted to start the service using the command sudo systemctl start nv-l4t-bootloader-config.service, but after execution, the service still remains in an inactive state. The service status log is as follows:

root@sinovatio-desktop:/data# sudo systemctl status nv-l4t-bootloader-config.service >/data/test111
root@sinovatio-desktop:/data# cat test111
○ nv-l4t-bootloader-config.service - Configure bootloader service
Loaded: loaded (/etc/systemd/system/nv-l4t-bootloader-config.service; enabled; vendor preset: enabled)
Active: inactive (dead) since Wed 1969-12-31 19:06:35 EST; 1min 20s ago
Process: 2619 ExecStart=/opt/nvidia/l4t-bootloader-config/nv-l4t-bootloader-config.sh -v (code=exited, status=0/SUCCESS)
Main PID: 2619 (code=exited, status=0/SUCCESS)
CPU: 431ms

Dec 31 19:06:35 sinovatio-desktop nv-l4t-bootloader-config.sh[2886]: COMPATIBLE_SPEC 3767--0003--1--jetson-orin-nano-devkit-
Dec 31 19:06:35 sinovatio-desktop nv-l4t-bootloader-config.sh[2886]: TEGRA_BOOT_STORAGE nvme0n1
Dec 31 19:06:35 sinovatio-desktop nv-l4t-bootloader-config.sh[2886]: TEGRA_CHIPID 0x23
Dec 31 19:06:35 sinovatio-desktop nv-l4t-bootloader-config.sh[2886]: TEGRA_OTA_BOOT_DEVICE /dev/mtdblock0
Dec 31 19:06:35 sinovatio-desktop nv-l4t-bootloader-config.sh[2886]: TEGRA_OTA_GPT_DEVICE /dev/mtdblock0
Dec 31 19:06:35 sinovatio-desktop nv-l4t-bootloader-config.sh[2619]: Info: Write TegraPlatformCompatSpec with 3767--0003--1--jetson-orin-nano-devkit-.
Dec 31 19:06:35 sinovatio-desktop nv-l4t-bootloader-config.sh[2619]: Info. Verifying boot status.
Dec 31 19:06:35 sinovatio-desktop nv-l4t-bootloader-config.sh[2891]: Info: variable BootChainFwStatus is not found.
Dec 31 19:06:35 sinovatio-desktop systemd[1]: nv-l4t-bootloader-config.service: Deactivated successfully.
Dec 31 19:06:35 sinovatio-desktop systemd[1]: Finished Configure bootloader service.
root@sinovatio-desktop:/data#

Is this normal? What operations or environment settings might be causing the issue?

Here are some additional findings:
1. We can confirm that the faulty device fails after rebooting 3 times, while powering it off and on 10 times at the same frequency does not result in a failure, which is consistent with the case you provided.
2. Our device can be roughly divided into three parts: the Nano core board, the base board, and the solid-state drive (SSD) as the only boot device. It has been confirmed that the fault is related to a specific SSD and is not related to any Nano core board or base board.
3. It has been confirmed that both the normal SSD and the faulty SSD have the nv-l4t-bootloader-config.service file in the /opt/nvidia/l4t-bootloader-config directory, and the MD5 checksums are the same.
4. The logs in the previous response are from a normal SSD. In the environment with a normal SSD, starting the service with sudo systemctl start nv-l4t-bootloader-config.service completes in about 1 second.
5. In the environment with the faulty SSD, starting the service with sudo systemctl start nv-l4t-bootloader-config.service gets stuck indefinitely, and it can only be exited using Ctrl+C. The logs in this environment are as follows:

○ nv-l4t-bootloader-config.service - Configure bootloader service
Loaded: loaded (/etc/systemd/system/nv-l4t-bootloader-config.service; enabled; vendor preset: enabled)
Active: inactive (dead)
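
One way to look at the hang without needing Ctrl+C, as a sketch: systemctl start normally waits for the start job to finish, while --no-block only queues the job and returns, so the job queue can then be listed to see which unit is still waiting or running. The function name start_and_show_jobs and its settle delay are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: queue a start job for UNIT without blocking, give
# systemd SETTLE seconds (default 2), then print the jobs still in the queue.
start_and_show_jobs() {
    local unit="$1" settle="${2:-2}"
    systemctl start --no-block "$unit"   # returns at once, job stays queued
    sleep "$settle"
    systemctl list-jobs --no-pager       # "waiting"/"running" jobs hint at the blocker
}
```

Called as sudo bash -c with start_and_show_jobs nv-l4t-bootloader-config.service, any unit that is still listed after the delay is a candidate for what the service is waiting on.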

Additional Findings
We found that the nv-l4t-bootloader-config.service on the faulty solid-state drive gets stuck at its dependent nvgetty.service. After manual testing, we discovered that the two dependent services of nvgetty.service, nv.service and nvpmodel.service, can both start normally, and the /etc/systemd/nvgetty.sh script can also run normally.

Can you help us confirm what is preventing the nvgetty.service from starting?
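
For reference, a small sketch of how the state of each unit in this chain could be captured in one pass. The helper name unit_states is hypothetical; ActiveState and SubState are standard systemd unit properties.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "UNIT ActiveState/SubState" for every unit given.
unit_states() {
    local unit
    for unit in "$@"; do
        printf '%s %s/%s\n' "$unit" \
            "$(systemctl show -p ActiveState --value "$unit")" \
            "$(systemctl show -p SubState --value "$unit")"
    done
}
```

For example, unit_states nv-l4t-bootloader-config.service nvgetty.service nv.service nvpmodel.service lists all four units together, and journalctl -b -u nvgetty.service --no-pager would show whatever nvgetty.service itself logged during the current boot.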

I have made some new progress. Can you help me analyze it?

Additional Findings
In the environment with the faulty solid-state drive, after the kernel has completed booting, the device cannot accept input via the serial port, but can be logged into via Telnet and used normally.
It has been confirmed that the console is using the device /dev/ttyTCU0. Using echo 123 > /dev/ttyTCU0 in the Telnet window shows that the serial port can output “123” normally.
The above issues are highly correlated with the failure to start nvgetty.service. Could it be that there is an error in my console configuration?
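
To cross-check the console configuration, the console= arguments on the kernel command line can be listed. This is only a sketch, and console_devices is a hypothetical helper name. Note that the echo test above only exercises the output path; accepting input additionally needs a login process (a getty) attached to the device.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the device name of every console= argument.
# Reads /proc/cmdline unless a command line string is passed as $1.
console_devices() {
    local cmdline="${1:-$(cat /proc/cmdline)}" arg
    local -a args
    read -r -a args <<< "$cmdline"
    for arg in "${args[@]}"; do
        case "$arg" in
            console=*)
                arg="${arg#console=}"   # drop the "console=" prefix
                echo "${arg%%,*}"       # drop options such as ",115200"
                ;;
        esac
    done
}
```

If ttyTCU0 is listed but nothing has the device open for login (for example, fuser -v /dev/ttyTCU0 shows no getty, assuming the psmisc package is installed), that would be consistent with nvgetty.service not having started.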

That should be the expected result, since the service has already been loaded during boot.

Do you mean that the issue is specific to the SSD?
(i.e. would it work if you switched to another SSD while keeping the same Orin Nano module and board?)

For nvgetty.service, please also share the full /var/log/syslog from the failing case.

Sorry, I put this issue on hold due to other work for about a day, and now the fault cannot be reproduced (I confirmed that the same faulty SSD was used).

While the issue was on hold, I made some minor changes, but I cannot recall all the details. Therefore, the analysis of this issue has to be paused for now.

Thank you very much for your guidance. If the fault reoccurs, I will investigate following the nv-l4t-bootloader-config.service → nvgetty.service → /var/log/syslog path. If there are any new developments, I will create a new topic and attach the link to this topic.
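
As a note for that follow-up, a minimal sketch for pulling the relevant lines out of syslog. The helper name service_log_lines is hypothetical, and the pattern simply matches the two service names discussed in this topic.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print syslog lines that mention either service.
# Reads /var/log/syslog unless another file is passed as $1.
service_log_lines() {
    local file="${1:-/var/log/syslog}"
    grep -E 'nv-l4t-bootloader-config|nvgetty' "$file"
}
```

The full /var/log/syslog would still be attached as requested; the filtered view is only meant as a quick first look.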

Good to hear that.
Thanks for the update.
The service logs are recorded in /var/log/syslog.