Jetson Nano fails to boot after running for a while

Jetson Nano module:

R32 (release), REVISION: 4.2, GCID: 20074772, BOARD: t210ref, EABI: aarch64, DATE: Thu Apr 9 01:22:12 UTC 2020


After running for some time, the device no longer boots normally and hangs at "Starting kernel …".

Boot log:
U-Boot 2016.07 (Sep 25 2020 - 12:43:24 +0800)

TEGRA210
Model: NVIDIA P3450-Porg
Board: NVIDIA P3450-PORG
DRAM: 4 GiB
MMC: Tegra SD/MMC: 0, Tegra SD/MMC: 1
*** Warning - spi_flash_probe() failed, using default environment

In: serial
Out: serial
Err: serial
Net: No ethernet found.
Card did not respond to voltage select!
** Bad device mmc 1 **
Hit any key to stop autoboot: 0
Card did not respond to voltage select!
switch to partitions #0, OK
mmc0(part 0) is current device
Scanning mmc 0:1…
Found /boot/extlinux/extlinux.conf
Retrieving file: /boot/extlinux/extlinux.conf
795 bytes read in 74 ms (9.8 KiB/s)
1: primary kernel
Retrieving file: /boot/initrd
5487776 bytes read in 228 ms (23 MiB/s)
Retrieving file: /boot/Image
34332680 bytes read in 807 ms (40.6 MiB/s)
append: tegraid=21.1.2.0.0 ddr_die=4096M@2048M section=512M memtype=0 vpr_resize usb_port_owner_info=0 lane_owner_info=0 emc_max_dvfs=0 touch_id=0@63 video=tegrafb no_console_suspend=1 console=ttyS0,115200n8 debug_uartport=lsport,2 earlyprintk=uart8250-32bit,0x70006000 maxcpus=4 usbcore.old_scheme_first=1 lp0_vec=0x1000@0xff780000 core_edp_mv=1075 core_edp_ma=4000 tegra_fbmem=0x140000@0x92cb0000 is_hdmi_initialised=1 root=/dev/mmcblk0p1 rw rootwait rootfstype=ext4 console=ttyS0,115200n8 console=tty0 fbcon=map:0 net.ifnames=0 sdhci_tegra.en_boot_part_access=1 root=/dev/mmcblk0p1 rw rootwait rootfstype=ext4 console=ttyS0,115200n8 console=tty0 fbcon=map:0 net.ifnames=0 sdhci_tegra.en_boot_part_access=1 quiet loglevel=0 vt.global_cursor_default=0 console=tty1 fbcon=map:1

Flattened Device Tree blob at 83100000

Booting using the fdt blob at 0x83100000
reserving fdt memory region: addr=80000000 size=20000
Using Device Tree in place at 0000000083100000, end 00000000831777fb

Starting kernel …

session.log (46.2 KB)
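
One thing that stands out in the boot log above is the `append:` line: several arguments are repeated, and it ends with `quiet loglevel=0`, which as far as I know suppresses kernel messages on the console and may be why nothing is printed after "Starting kernel …". The following is only a minimal sketch for inspecting such a line; the helper names are made up for illustration and the sample string is a shortened excerpt of the log, not the full line.

```python
from collections import OrderedDict

def parse_cmdline(cmdline):
    """Split a kernel command line into (key, value) pairs; bare flags get None."""
    pairs = []
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        pairs.append((key, value if sep else None))
    return pairs

def repeated_keys(pairs):
    """Return keys that occur more than once, with all their values in order."""
    seen = OrderedDict()
    for key, value in pairs:
        seen.setdefault(key, []).append(value)
    return {k: v for k, v in seen.items() if len(v) > 1}

def console_is_silenced(pairs):
    """True if the line carries 'quiet' or 'loglevel=0' (last occurrence wins)."""
    keys = dict(pairs)
    return "quiet" in keys or keys.get("loglevel") == "0"

# Shortened excerpt of the append line from the log (not the full line).
sample = (
    "root=/dev/mmcblk0p1 rw rootwait console=ttyS0,115200n8 console=tty0 "
    "root=/dev/mmcblk0p1 rw rootwait console=ttyS0,115200n8 console=tty0 "
    "quiet loglevel=0 console=tty1"
)

pairs = parse_cmdline(sample)
dups = repeated_keys(pairs)
for key, values in dups.items():
    print(key, values)
print("console silenced:", console_is_silenced(pairs))
```

On a debug unit, temporarily booting without `quiet loglevel=0` might show how far the kernel actually gets before it stops.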

Hi cuiep,

Are you using the devkit or custom board for Jetson Nano?
Could you update to latest R32.7.3 to verify?

[0007.760] I2C slave not started
[0007.763] I2C write failed
[0007.766] Writing offset failed
[0007.769] eeprom_init: EEPROM read failed
[0007.773] pm_ids_update: eeprom init failed

It seems your EEPROM could not be read when the error occurs.
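
To check whether the same EEPROM errors show up in other captured logs, a small scan like the one below could be used. This is just a sketch: the marker strings are copied from the lines quoted above, while the function name and the `[seconds] message` line format are assumptions based on this one log.

```python
import re

# Messages copied from the failing log quoted above.
EEPROM_MARKERS = (
    "I2C slave not started",
    "I2C write failed",
    "Writing offset failed",
    "eeprom_init: EEPROM read failed",
    "pm_ids_update: eeprom init failed",
)

# Assumed line format: "[0007.760] message"
LINE_RE = re.compile(r"^\[(\d+\.\d+)\]\s+(.*)$")

def find_eeprom_errors(log_text):
    """Return (timestamp, message) for every line matching a known marker."""
    hits = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not a timestamped bootloader line
        timestamp, message = float(match.group(1)), match.group(2)
        if any(marker in message for marker in EEPROM_MARKERS):
            hits.append((timestamp, message))
    return hits

sample_log = """\
[0007.760] I2C slave not started
[0007.763] I2C write failed
[0007.766] Writing offset failed
[0007.769] eeprom_init: EEPROM read failed
[0007.773] pm_ids_update: eeprom init failed
U-Boot 2016.07 (Sep 25 2020 - 12:43:24 +0800)
"""

hits = find_eeprom_errors(sample_log)
for timestamp, message in hits:
    print(f"{timestamp:.3f}s  {message}")
```

Running it over both the failing log and the normal log would show whether these messages appear only on the failing boots.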

Do you mean that this issue happens only occasionally?

How do you flash the board?

Yes, this happens occasionally. After I re-flash the module, it boots normally again.
The equipment has already been shipped to customers, and once this problem occurs the system must be returned to the factory. Is there any way to identify the cause of this problem?

I flash the board with the following command:
sudo ./flash.sh -r APP jetson-nano-emmc mmcblk0p1

This is the log from a normal boot:


session.log (24.5 KB)

So, are you using the custom carrier board without EEPROM?

Have you tried using the following command to flash the whole board?

sudo ./flash.sh jetson-nano-emmc mmcblk0p1

Is there only one device with this boot up issue?

  1. Yes, we only use the module's internal storage on our custom carrier board.
  2. About "sudo ./flash.sh jetson-nano-emmc mmcblk0p1": because we customized our own application configuration on top of the original rootfs, we use the clone and restore method for batch flashing.
  3. No, a new failing device appears every so often; it is now about five devices.

Could you just get one board for debug to check whether flashing the whole board with default BSP package and rootfs would hit this issue or not?

Are there any boards that work without this issue?
You could try a reboot stress test.
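
For the reboot stress test, each boot's serial output could be saved to its own file and then classified automatically. Below is a minimal sketch of the classification step only; capturing the serial output is not shown. The function name, the outcome labels and the `login:` success marker are assumptions for illustration, so adjust the marker to whatever a good boot prints on your console.

```python
from collections import Counter

def classify_boot(log_text, success_marker="login:"):
    """Classify one captured boot log.

    success_marker is an assumption: any text that only appears once the
    system has fully booted (here a login prompt on the serial console).
    """
    if "Starting kernel" not in log_text:
        return "no_kernel_handoff"
    after_handoff = log_text.split("Starting kernel", 1)[1]
    if success_marker in after_handoff:
        return "booted"
    return "hung_after_kernel_start"

# Two tiny illustrative logs; real ones would come from per-cycle capture files.
hung_log = "Retrieving file: /boot/Image\nStarting kernel …\n"
good_log = "Retrieving file: /boot/Image\nStarting kernel …\nlogin:\n"

results = Counter(classify_boot(log) for log in (hung_log, good_log, good_log))
print(dict(results))
```

Tallying the outcomes over many power cycles gives a rough failure rate that can be compared against the rate seen in the field.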

1. Not every device shows the problem, and the timing is uncertain; it can take months.
2. Currently fewer than 1% of devices have this issue.
3. I can re-flash one board that had this issue and stress test it, but the issue will not necessarily reproduce.
4. Is there any way to check the contents of the eMMC to troubleshoot the problem?

I just want to clarify whether the custom rootfs is causing this issue, or whether the eMMC is damaged.
Does it recover after re-flashing?

Yes, it recovers and boots normally after re-flashing.

Would it still happen after using it for a while?

It happens with some probability; it is not deterministic.

Does this happen on every module, or only on a single module?
Also, can the issue be reproduced on a devkit?

It has already occurred on six boards.

OK. Then what about this question?

Also, can the issue be reproduced on a devkit?

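It has never been reproduced on a devkit. The current occurrence rate is about 1%. But once it occurs it is fatal: the product must be returned to the factory so the module can be re-flashed.

I only have two devkits on hand, so it is hard to reproduce. Is there a way to examine a module that can no longer boot, to help with troubleshooting?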

By "a module that can no longer boot", do you mean that it hangs at "Starting kernel" on every boot?

Yes. It hangs at "Starting kernel" every time.

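Which tests are you currently able to run?

For example, can you put one of these "modules that can no longer boot" directly on a devkit and try to boot it?
And can you re-flash it?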