AGX Xavier fails to boot from eMMC - Kernel panic

Dear NVIDIA Team,

We have a custom AGX Xavier system with JetPack 4.6 on the eMMC that fails to boot. Here is the log file:
log_boot_loop.txt (49.1 KB)

After booting from an external USB drive, we checked the eMMC health:
mmc_log.txt (10.3 KB)

In similar cases, you have suggested an RMA for the device.

Before doing so, we would like to know whether there is anything further we can check to find out why the system ended up in this state.
Also, if we reflash the device and it boots from the eMMC, how can we make sure that the eMMC is okay?
Thank you.

Best regards

I can’t answer that, but I can suggest a very important test: try to clone the rootfs. The carrier board shouldn’t matter during a clone. If the clone succeeds, then the eMMC can at least be read across the rootfs partition. Success also gives you a backup of exactly what the rootfs contained at the time of failure. A clone produces both a “raw” image (a bit-for-bit exact copy of the partition) and a “sparse” image (which omits empty space and is faster to flash, but is not useful for examination). As the filesystem fills, the size of the sparse image approaches the size of the raw image.
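Since the raw clone is a bit-for-bit copy, one possible extra check (my own suggestion, not an NVIDIA procedure) is to clone twice and compare the two raw images. If the eMMC reads back consistently, the hashes should be identical. Here is a minimal Python sketch; the helper names `sha256_of_file` and `clones_match` are made up for illustration:

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in chunks
    so that a multi-gigabyte raw clone does not have to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def clones_match(path_a, path_b):
    """True if two raw clone images have identical contents."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```

A mismatch between two raw clones taken back to back would point at unstable reads, but matching hashes only show that reads are repeatable, not that the eMMC can still be written.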

Whether the clone succeeds or fails, it tells you a lot about the health of the eMMC across at least part of its address range. Just be sure to monitor “dmesg --follow” on the host PC during the clone, since any errors logged there might add clues.
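If the dmesg output scrolls by too quickly to watch, a small filter can help pick out the interesting lines. The following is only a sketch: the pattern list is my own guess at what is worth flagging (I/O errors, hung-task warnings, USB disconnects, mmc messages), not an official list, and `suspicious_lines` is a made-up name.

```python
import re

# Assumed, non-exhaustive set of patterns worth a second look during a clone.
SUSPICIOUS = re.compile(
    r"I/O error"
    r"|blocked for more than \d+ seconds"
    r"|USB disconnect"
    r"|\bmmc(blk)?\d+",
    re.IGNORECASE,
)


def suspicious_lines(lines):
    """Return the log lines that match any of the assumed patterns,
    with trailing newlines stripped."""
    return [line.rstrip("\n") for line in lines if SUSPICIOUS.search(line)]
```

The function accepts any iterable of lines, so it can be fed a saved log file or the piped output of “dmesg --follow” read from standard input.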

Hi linuxdev

Thank you for your answer.
We tried to clone the image. Reading the eMMC was successful, although it felt really slow. But when we try to flash the cloned image to another device, it gets stuck at 0% while writing the APP partition. In dmesg we only get the message that the tegraflash process was blocked for more than 120 seconds.

Should we just try to reflash the AGX?

You can also try JetPack 5.1.1 to clarify whether this is really related to eMMC health.

Hi WayneWWW

We tried flashing the device with JetPack 5.1, without success. We get the error:

NV3P_SERVER: Failed to initialize partition table from GPT.

Is there anything else we can try, or do we have to RMA the AGX Xavier?

Is your test based on the NVIDIA devkit or a custom board?

It is based on a custom board.

Incidentally, a cloned image should flash to any Jetson regardless of whether the image is valid or not (though it is possible that the size of the image can cause errors if a different size is expected). It could be random bits, all NULL bytes, and so on, and the flash would still work (it sure would not boot though). The fact that it won’t flash to another unit says something is wrong with the hardware or firmware. I don’t think device trees matter a lot during flash, so I think odds are that it is a hardware design error shared by the different Jetsons. About the only other possibility is that your flash software is corrupt, or perhaps that you are using a VM.

I would suggest checking on the devkit if you want to RMA.

We flash many other AGX Xavier modules with the same carrier board from the same host without any issues. OK WayneWWW, we will take out the module and test it on the DevKit, even though the result will probably be the same.

Hi WayneWWW

We have now tried flashing the AGX Xavier on the DevKit, with the same outcome:

NV3P_SERVER: Failed to initialize partition table from GPT.

Should we RMA the device?

Yes, please RMA the device.
