Xavier NX Devkit SD card potentially failed. How to reflash NVMe SSD with rootfs?

I have a Jetson Xavier NX Devkit with microSD card (P3668-0000 module), in a non-accessible place. No hardware manipulation is possible, and no keyboard inputs except predetermined ones through shell. This NX is connected to another host computer via USB and Ethernet connection. The host computer is also a Jetson Xavier NX Devkit but with eMMC.

The NX was unable to load the rootfs on SD card /dev/mmcblk0p1, resulting in these messages:

[ 13.280457] mmcblk0: error -110 transferring data, sector 13921832, nr 192, cmd response 0x900, card status 0x80b00
[ 13.803172] mmc0: Data timeout error
[ 13.803278] sdhci: =========== REGISTER DUMP (mmc0)===========
[ 13.803385] sdhci: Sys addr: 0x000000c0 | Version: 0x00000505
[ 13.803491] sdhci: Blk size: 0x00007200 | Blk cnt: 0x000000b8
[ 13.803599] sdhci: Argument: 0x00d46e28 | Trn mode: 0x0000003b
[ 13.803706] sdhci: Present: 0x01fb0000 | Host ctl: 0x00000017
[ 13.803814] sdhci: Power: 0x00000001 | Blk gap: 0x00000000
[ 13.803921] sdhci: Wake-up: 0x00000000 | Clock: 0x00000007
[ 13.804029] sdhci: Timeout: 0x00000008 | Int stat: 0x00000000
[ 13.804137] sdhci: Int enab: 0x02ff008b | Sig enab: 0x02fc008b
[ 13.804245] sdhci: AC12 err: 0x00000000 | Slot int: 0x00000000
[ 13.804353] sdhci: Caps: 0x3f6cd08c | Caps_1: 0x18002f73
[ 13.804460] sdhci: Cmd: 0x0000123a | Max curr: 0x00000000
[ 13.804939] sdhci: Host ctl2: 0x0000300b
[ 13.805242] sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x00000000ffefe420
[ 13.805742] sdhci: ===========================================
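
Error -110 in the first line is the kernel's ETIMEDOUT, which matches the "Data timeout error" on the next line. As a minimal sketch (not part of the original post), the function below counts such errors per device from kernel log text on stdin. The name summarize_mmc_errors is made up for illustration, and the parsing assumes the log format shown above.

```shell
# Summarize mmcblk I/O errors found in kernel log text supplied on stdin.
# Prints one line per (device, error code) pair: "<device> <errno> <count>".
summarize_mmc_errors() {
    awk '
        # Matches lines such as:
        #   mmcblk0: error -110 transferring data, sector 13921832, ...
        /mmcblk[0-9]+: error -?[0-9]+/ {
            dev = ""; err = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^mmcblk[0-9]+:$/) { dev = $i; sub(/:$/, "", dev) }
                if ($i == "error")          { err = $(i + 1) }
            }
            count[dev " " err]++
        }
        END {
            for (k in count) print k, count[k]
        }
    '
}
```

On a system that still boots far enough to give a shell, it could be fed with `dmesg | summarize_mmc_errors`; with only a serial capture, pipe the saved log file in instead.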

I have a copy of this remote setup in the lab and I tested a procedure to flash system.img onto NX’s /dev/mmcblk0p1. Here are the steps I’ve completed:

  1. Uploaded these packages to the remote host computer:
  • arm_tegraflash_R32.7.1_aarch64.tbz2 (flashing tool and utilities for ARM-based flashing on the host computer)
  • Jetson_Linux_R32.6.1_aarch64_minimal.tar.gz (minimal Board Support Package for Jetson Xavier NX Devkit)
  • Debian packages needed to run the flashing utility (qemu-user-static, libxml2-utils, uuid-runtime)

  2. Extracted both Jetson_Linux_R32.6.1_aarch64_minimal.tar.gz and arm_tegraflash_R32.7.1_aarch64.tbz2 into …/Linux_for_Tegra.

  3. Used dd to image the entirety of the host computer’s /dev/mmcblk0p1 onto …/Linux_for_Tegra/bootloader/system.img (the host computer is a Jetson Xavier NX Devkit with eMMC, so this copies the host computer’s rootfs).

  4. Put the target NX into recovery mode and turned it on.

  5. Ran sudo ./flash.sh -r jetson-xavier-nx-devkit mmcblk0p1 on the host computer.

  6. Power cycled the target NX and took it out of recovery mode.
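
The imaging and flashing steps above can be sketched as a small dry-run script. This is only an illustration under assumptions: L4T_DIR, DRY_RUN, run, and reflash_sd_rootfs are hypothetical names introduced here, and the dd options (bs, conv, status) are a choice made for this sketch, not something stated in the post. By default it only prints the commands.

```shell
# Print (default) or execute the imaging and flashing commands.
# L4T_DIR and DRY_RUN are hypothetical variables introduced for this sketch.
L4T_DIR="${L4T_DIR:-Linux_for_Tegra}"
DRY_RUN="${DRY_RUN:-1}"

# Echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reflash_sd_rootfs() {
    # Copy the host's rootfs partition into the image that flash.sh writes.
    run sudo dd if=/dev/mmcblk0p1 of="$L4T_DIR/bootloader/system.img" \
        bs=1M conv=fsync status=progress
    # -r reuses the existing system.img instead of rebuilding it from rootfs/.
    run sh -c "cd '$L4T_DIR' && sudo ./flash.sh -r jetson-xavier-nx-devkit mmcblk0p1"
}
```

Calling `reflash_sd_rootfs` with DRY_RUN=1 lists both commands for review; setting DRY_RUN=0 runs them. Note that imaging a mounted, live rootfs with dd can capture an inconsistent filesystem, so treat the resulting image with some caution.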

When I run the above procedure in the lab, I am able to flash the NX’s SD card successfully, boot the kernel and load the rootfs. However, when I run it on my remote setup, I still see the same error messages. This suggests that the remote NX’s SD card has a hardware I/O fault.

The target NX does contain a 1 TB ext4 NVMe SSD as /dev/nvme0n1 and would normally be mounted on the filesystem after boot.

My questions are:

1. Are there any operations I can do on the NX SD card to boot the rootfs and potentially fix or recover the SD card partitions, such as booting from a recovery rootfs or using an initial ramdisk to try to recover the SD card?

2. Could I flash the rootfs onto the remote NX’s NVMe SSD? How would I perform this operation?

3. Since the remote NX’s SD card might be damaged, is it advised to boot from the NVMe SSD? If so how would this be done?

Many thanks!

Hi weiyu.tong,
Is it possible to put your remote device into recovery mode? This is relevant because in the flashing process you mentioned earlier, step 4 specifically involves setting the device into recovery mode.
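
One way to confirm recovery mode from the host shell is to look for the module on the USB bus. A minimal sketch follows, assuming the Xavier NX enumerates in recovery mode as USB ID 0955:7e19 (the value NVIDIA's quick-start guide lists for this module; verify it against your release). The function name in_recovery_mode is hypothetical.

```shell
# Succeeds if lsusb-style text on stdin shows a Jetson Xavier NX in
# recovery mode. Assumption: the module enumerates as USB ID 0955:7e19.
in_recovery_mode() {
    grep -qi 'ID 0955:7e19'
}
```

Typical use on the host would be `lsusb | in_recovery_mode && echo "target is in recovery mode"`. Matching the full ID rather than only the NVIDIA vendor ID avoids a false positive when the target is booted normally and exposes its USB device-mode interface.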

For questions 2 and 3, you can refer to the doc in Linux_for_Tegra/tools/kernel_flash/README_initrd_flash.txt:

Example 2: In this example, you want to boot Jetson Xavier NX SD from an
attached NVMe SSD. The SD card does not need to be plugged in. You can also
apply this if you don't want to use the emmc on the Jetson Xavier NX emmc.
....

Thanks

Thanks for the advice,

Yes, I’m able to put the target NX into recovery mode and I’ve been following the instructions in README_initrd_flash.txt Workflow 10 Example 2. Question 1: I’m using Jetson_Linux_R32.6.1, do you recommend uploading and upgrading to more recent releases of Jetson Linux?

I’m looking at the diffs between README_initrd_flash.txt from Jetson_Linux_R32.6.1 and Jetson_Linux_R35.6.0 and notice that the command for “generate a normal filesystem for the external device” has these differences:

$ sudo ./tools/kernel_flash/l4t_initrd_flash.sh --no-flash \
            --external-device nvme0n1p1 \
<<<            -S 8GiB -c ./tools/kernel_flash/flash_l4t_nvme.xml \
>>>            -c ./tools/kernel_flash/flash_l4t_external.xml \
            --external-only --append jetson-xavier-nx-devkit external

Do you know which one is correct? Should it be flash_l4t_nvme.xml or flash_l4t_external.xml?

I followed the instructions in Workflow 10, Example 2 and got these errors:

cp: cannot stat '/mnt/ssd/xcomm_flash/Linux_for_Tegra/rootfs/usr/sbin/flash_erase': No such file or directory
Cleaning up...
Making system.img... 
.../Linux_for_Tegra/rootfs/boot/extlinux/extlinux.conf is not found, exiting...
Error: Failed to generate images for external device
Cleaning up...

So I copied the host computer’s /usr/sbin/* and /boot/extlinux into .../Linux_for_Tegra/rootfs before I could generate filesystem.
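
The two paths named in the errors above can be checked before running l4t_initrd_flash.sh. This is a minimal sketch that covers only those two files; other prerequisites may exist, and the function name missing_rootfs_files is made up for illustration.

```shell
# Print the paths (relative to the rootfs directory) that the errors above
# reported as missing, if they are absent. Prints nothing when both exist.
missing_rootfs_files() {
    rootfs_dir="$1"
    for rel in usr/sbin/flash_erase boot/extlinux/extlinux.conf; do
        [ -e "$rootfs_dir/$rel" ] || echo "$rel"
    done
}
```

For example, `missing_rootfs_files .../Linux_for_Tegra/rootfs` prints one line per missing file. Empty output means both files are in place.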

After the filesystem generation ran successfully, I ran sudo ./tools/kernel_flash/l4t_initrd_flash.sh --flash-only, but this failed, and the flashing logs only showed this:


**********************************************
*                                            *
*  Step 1: Build the flashing environment    *
*                                            *
**********************************************
Create flash environment 0
/mnt/ssd/xcomm_flash/Linux_for_Tegra/bootloader /mnt/ssd/xcomm_flash/Linux_for_Tegra
/mnt/ssd/xcomm_flash/Linux_for_Tegra
Finish creating flash environment 0.
****************************************************
*                                                  *
*  Step 2: Boot the device with flash initrd image *
*                                                  *
****************************************************
/mnt/ssd/flashing_workspace/Linux_for_Tegra/temp_initrdflash/bootloader0 /mnt/ssd/flashing_workspace/Linux_for_Tegra
./tegraflash.py --bl nvtboot_recovery_cpu_t194_sigheader.bin.encrypt --bct br_bct_BR.bct --securedev  --applet rcm_2_encrypt.rcm --applet_softfuse rcm_1_encrypt.rcm --cmd "rcmboot"  --cfg secureflash.xml --chip 0x19 --mb1_bct mb1_bct_MB1_sigheader.bct.encrypt --mem_bct mem_rcm_sigheader.bct.encrypt --mb1_cold_boot_bct mb1_cold_boot_bct_MB1_sigheader.bct.encrypt --mem_bct_cold_boot mem_coldboot_sigheader.bct.encrypt  --bins "mb2_bootloader nvtboot_recovery_t194_sigheader.bin.encrypt; mts_preboot preboot_c10_prod_cr_sigheader.bin.encrypt; mts_mce mce_c10_prod_cr_sigheader.bin.encrypt; mts_proper mts_c10_prod_cr_sigheader.bin.encrypt; bpmp_fw bpmp_t194_sigheader.bin.encrypt; bpmp_fw_dtb tegra194-a02-bpmp-p3668-a00_sigheader.dtb.encrypt; spe_fw spe_t194_sigheader.bin.encrypt; tlk tos-trusty_t194_sigheader.img.encrypt; eks eks_sigheader.img.encrypt; kernel boot0.img; kernel_dtb kernel_tegra194-p3668-all-p3509-0000.dtb; bootloader_dtb tegra194-p3668-all-p3509-0000_sigheader.dtb.encrypt"    --ramcode 1  --instance 1-2.2
Welcome to Tegra Flash
version 1.0.0
Type ? or help for help and q or quit to exit
Use ! to execute system commands
 

 Entering RCM boot

[   0.0000 ] rcm boot with presigned binaries
[   0.0000 ] Boot Rom communication
[   0.0030 ] tegrarcm_v2 --instance 1-2.2 --chip 0x19 0 --rcm rcm_1_encrypt.rcm --rcm rcm_2_encrypt.rcm
[   0.0056 ] B

Question 2: Do I need to generate a normal filesystem for the external device (as indicated in Workflow 10, Example 2, Step 3) if I’ve already imaged the host computer’s /dev/mmcblk0p1 and wrote it to system.img?

Question 3: Do I need to power cycle and actively put the target NX in recovery mode between each step or can I put the device into recovery mode once at the start of the steps all the way through the completion of step 3?

Hi weiyu.tong,
We recommend upgrading to version r35.6.0 and using the r35.6.0 version command.

sudo ./tools/kernel_flash/l4t_initrd_flash.sh --no-flash \
            --external-device nvme0n1p1 \
            -c ./tools/kernel_flash/flash_l4t_external.xml \
            --external-only --append jetson-xavier-nx-devkit external

For question 2, please refer to the Quick Start: r35.6

It appears you may have skipped some essential commands, resulting in missing files. Please execute the following commands before proceeding to the Example 2 steps.

cd Linux_for_Tegra/
sudo ./apply_binaries.sh
sudo ./tools/l4t_flash_prerequisites.sh

For question 3: yes, put the device into recovery mode at each step.

Hi David,

I upgraded to Jetson_Linux_R35.6 and followed all of the instructions from Quick Start — NVIDIA Jetson Linux Developer Guide 1 documentation

I followed the instructions in Workflow 11 Example 2 of README_l4t_initrd_flash.txt. I also modified flash_l4t_external.xml so that device type="nvme", and I changed "NUM_SECTORS" to the correct number of 512-byte sectors on my NVMe SSD.
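
For reference, the sector count is the device size in bytes divided by the sector size. A minimal sketch, assuming the layout file uses 512-byte sectors; the function name bytes_to_sectors is hypothetical.

```shell
# Convert a device size in bytes to a count of 512-byte sectors.
# Assumption: the partition layout file expects 512-byte sectors.
bytes_to_sectors() {
    echo $(( $1 / 512 ))
}
```

On a machine where an identical SSD is visible, the size could be read with `sudo blockdev --getsize64 /dev/nvme0n1` and passed in. As an illustration, a drive reporting 1000204886016 bytes gives 1953525168 sectors. `blockdev --getsz` reports the 512-byte sector count directly.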

For reference this is what I followed:

Example 2: In this example, you want to boot Jetson Xavier NX SD from an
attached NVMe SSD. The SD card does not need to be plugged in. You can also
apply this if you don't want to use the emmc on the Jetson Xavier NX emmc.

1. Put the device into recovery mode, then generate qspi only images
for the internal device:
$ sudo ./tools/kernel_flash/l4t_initrd_flash.sh --no-flash jetson-xavier-nx-devkit-qspi internal

Note: The board name given here is not jetson-xavier-nx-devkit or
jetson-xavier-nx-devkit-emmc so that no SD card or eMMC images are generated.


2. Put the device into recovery mode, then generate a normal
filesystem for the external device:
$ sudo ./tools/kernel_flash/l4t_initrd_flash.sh --no-flash \
            --external-device nvme0n1p1 \
            -c ./tools/kernel_flash/flash_l4t_external.xml \
            --external-only --append jetson-xavier-nx-devkit external

3. Put the device into recovery mode, then flash both images:
$ sudo ./tools/kernel_flash/l4t_initrd_flash.sh --flash-only

Unfortunately, this did not succeed and I received these errors during the flashing process:

Error: Could not stat device /dev/mmcblk0 - No such file or directory.

Error flashing non-qspi storage

When I attempted to boot the NX, I receive these errors when kernel_boot_app is started:

I/TC: Reserved shared memory is disabled
I/TC: Dynamic shared memory is enabled
I/TC: Normal World virtualization support is disabled
I/TC: Asynchronous notifications are disabled
I/TC: WARNING: Test UEFI variable auth key is being used !
I/TC: WARNING: UEFI variable protection is not fully enabled !

E/TC:? 0 get_rpc_alloc_res:645 RPC allocation failed. Non-secure world result: ret=0xffff0000 ret_origin=0
E/LD:  init_elf:486 sys_open_ta_bin(bc50d971-d4c9-42c4-82cb-343fb7f37896)
E/TC:? 0 ldelf_init_with_ldelf:131 ldelf failed with res: 0xffff000c

Is it possible to put everything on the external NVMe SSD (bootloader, kernel, rootfs, APP, and other partitions) and boot from it? I’d like to find a way to not use the internal SD card at all.

Hi,
Sorry for the late response.
Has the issue been resolved?
If not, please attach your flashing log and the serial console log for us to review.
For the serial console log, you can refer to the link below.

Thanks
