Different Xavier NX Modules Exhibit Inconsistent Flashing Behavior

Good morning! I am trying to create a workflow for our product that will allow mass flashing of our OS and BSP to multiple Jetson Xavier NX boards at once. I am testing on the NVIDIA Jetson Xavier NX Devkit with production Xavier NX modules (eMMC only, no SD card).

We are currently using JetPack 4.6 and L4T 32.6.1 because our vendors developed custom drivers for this platform version. For these flashing tests I am using the NVIDIA-provided rootfs and unmodified sources and drivers, as well as the compatibility overlay for Xavier NX production modules from this link

We want to use an NVMe SSD as our primary boot device, so we are exploring NVIDIA’s massflash utility with initrd flash. While exploring this method we have encountered a problem. Jetson Xavier NX modules with 699-level part numbers ending in 301 flash and boot correctly with the massflash blob we generated. Modules with 699-level part numbers ending in 300 also flash successfully, but they fail to boot after flashing completes.

As noted in Jetson_Xavier_NX_PCN_206980_DRAM_eMMC_Manufacturing_BOM_Expansion_v3.pdf (630.6 KB), any Jetson Xavier NX module with a 699-level part number version greater than or equal to 301 may be affected by this PCN, in which new Hynix memory and eMMC parts were added to the BOM of Jetson modules. The PCN notes that JetPack 4.4.1 / BSP 32.4.4 or later releases of Linux for Tegra (L4T) will include the required changes. If this is true, why are we experiencing boot failures on some of our Jetson Xavier NX modules with JetPack 4.6 and L4T 32.6.1?

These are the two Jetson Xavier NX modules in question:

If you scan the QR code, the 699 level part number can be seen and we can tell that one of the modules has a part number ending in 300 and one ending in 301.

After flashing, the boot of the module with the 300 part number hangs, and the UART log ends with the following:

[0001.865] -----------------------------------------------
[0001.871] Synchronous Exception: DATA ABORT (FAR: 0)
[0001.876] -----------------------------------------------
[0001.881] PAR_ELX: 0x80f
[0001.883] 
[0001.884] ESR 0x96000007: ec 0x25, il 0x1, iss 0x7
[0001.889] -----------------------------------------------
[0001.894]  [Stack Trace]
[0001.896] 
[0001.898] => pc:0xA063EE58, sp:0xA0695AA0
[0001.901] => pc:0xA060F384, sp:0xA0695CE0
[0001.905] => pc:0xA060F520, sp:0xA0695D80
[0001.909] => pc:0xA060F534, sp:0xA0695DF0
[0001.913] => pc:0xA060F1B8, sp:0xA0695E30
[0001.917] => pc:0xA060E72C, sp:0xA0695E40
[0001.921] => pc:0xA060E700, sp:0xA0695E50
[0001.925] -----------------------------------------------
[0001.930] iframe 0xa06959b0:
[0001.933] x0  0x               0 x1  0x        a0695dc0 x2  0x        a0695dc8 x3  0x        a0695dd0
[0001.942] x4  0x               0 x5  0x               0 x6  0x               0 x7  0x               0
[0001.951] x8  0x              43 x9  0x               a x10 0x        a0695e60 x11 0x            a188
[0001.960] x12 0x        a06901a0 x13 0x               0 x14 0x               0 x15 0x               0
[0001.969] x16 0x               0 x17 0x               0 x18 0x               0 x19 0x               0
[0001.978] x20 0x        a0695dc8 x21 0x               0 x22 0x        a065fc76 x23 0x        a0695dc0
[0001.987] x24 0x        a0695dd0 x25 0x        a0695e00 x26 0x               0 x27 0x               0
[0001.996] x28 0x               0 x29 0x        a0695ce0 lr  0x        a063ee48 sp  0x        a0695aa0
[0002.005] elr 0x        a063ee58
[0002.008] spsr 0x        20000209
[0002.012] -----------------------------------------------
[0002.017] panic (caller 0xa0601238): die
[0002.021] HALT: spinning forever...

The module with the 301 part number, by contrast, is able to boot fully.

I have attached the UART logs of the 300 and 301 modules booting after flash as
xavier_300_failed_postflash_UART.log (19.5 KB) and
xavier_301_success_postflash.log (40.0 KB)
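
When checking many units, the failure shown above can be detected in a captured UART log automatically. The following is a minimal sketch, assuming each module's UART output has been saved to a file; the function name `uart_boot_failed` is hypothetical, and the two patterns are copied from the failing log of the 300 module.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if a saved UART log contains the bootloader
# crash signature seen on the 300 module, non-zero otherwise.
uart_boot_failed() {
    log_file="$1"
    # Both lines appear in the failing log: the exception banner and the halt.
    grep -q 'Synchronous Exception: DATA ABORT' "$log_file" &&
        grep -q 'HALT: spinning forever' "$log_file"
}
```

For example, `uart_boot_failed xavier_300_failed_postflash_UART.log && echo "boot failed"` would flag the attached failing log. A log that stops for any other reason is not detected by this check.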

I have also attached the logs from the initrd kernel flashing for the 300 and 301 modules as
flash_1-4_0_20220811-092134_300.log (21.3 KB) and
flash_1-4_0_20220811-095302_301.log (21.3 KB)

How can I make a single massflash blob that is compatible with both module types?

Thanks!

Hi,

Back to the basic question here. Does it work if you use flash.sh instead of the massflash tool?

Since I had created a mass-flash environment, it seems I don’t have flash.sh in this folder to try. Going back to the JetPack_4.6_Linux_JETSON_XAVIER_NX_TARGETS folder, I first tried flashing using the SDK Manager GUI and then using flash.sh. This is the same folder and set of files I used to generate the massflash folder.

When using the SDK Manager GUI to flash to the NVMe SSD only with JetPack 4.6 rev 3, both variants of the Xavier NX complete the flash and boot without a problem.

boot_300_success_UART.log (39.4 KB)
boot_301_success_UART.log (39.4 KB)
flash_GUI_300_success.log (64.0 KB)
flash_GUI_301_success.log (64.0 KB)

Next I tried manually flashing using flash.sh without the SSD. I ran sudo ./flash.sh jetson-xavier-nx-devkit-emmc mmcblk0p1

This seems to have worked for both the 301 and 300 variants of the production Jetson Xavier NX. However, I am trying to validate a workflow that will let us flash multiple units at a time to both the SSD and the internal storage devices, so flashing one unit at a time will not be good enough.

Hi,

Another question here. When you created the massflash blob, did you create one for each kind of module, or did you use one blob for every module?

I mean, did you create one file for 300 and another for 301?

We are trying to create a single blob that will work for both 300 and 301, as we receive both units mixed in a single shipment and we don’t want to have separate processes for identifying and sorting Xavier NX modules.

Hi,

Please first try whether separate blobs work.

I will try and update tomorrow. Is there no other option for me to try? We really don’t want to have two massflash processes for different Xavier NX variants.

BTW, is this rel-32.6.1 or rel-32.7.1?

We are using release 32.6.1. @WayneWWW

It would be better to test with 32.7.1.

I will try it. My apologies for the lack of updates. I will implement your suggestions and update this thread by the end of the week.

For now we need to stay on L4T release 32.6.1. I connected my 300-variant Xavier NX module to the host PC and generated MFI files using the online-module method. I then repeated the MFI generation with the 301-variant module connected to the host PC.

For now we will have a process to identify which Xavier NX variant we are about to massflash, and we will use the MFI file that corresponds to that module.
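
The identification step could be sketched roughly as below, assuming the 699-level part number has already been read from the module's QR code as a string. The function name `select_mfi` and both archive names are hypothetical, and the part number in the usage note only illustrates the expected shape.

```shell
#!/bin/sh
# Hypothetical helper: map a 699-level part number to the MFI archive that
# was generated with that module variant attached.
select_mfi() {
    part_number="$1"
    case "$part_number" in
        699-*-300) echo "mfi_xavier_nx_300.tar.gz" ;;  # hypothetical archive name
        699-*-301) echo "mfi_xavier_nx_301.tar.gz" ;;  # hypothetical archive name
        *)
            # Any other suffix is rejected so it gets a manual check.
            echo "unknown part number: $part_number" >&2
            return 1
            ;;
    esac
}
```

Calling `select_mfi 699-13668-0001-300` would print the archive name for the 300 variant. Failing on unrecognized suffixes is a deliberate choice here, since the thread only covers the 300 and 301 variants.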

I will look into automating this process on 32.6.1 at a later time.

This is how I generated the MFI: sudo ./tools/kernel_flash/l4t_initrd_flash.sh --no-flash --massflash 6 --external-device nvme0n1 -c ./tools/kernel_flash/flash_l4t_nvme_250.xml -S 225GiB jetson-xavier-nx-devkit-emmc nvme0n1p1
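
Since the same command is run once per variant, it can help to keep the two resulting archives apart. The sketch below only prints the steps and does not run them. It assumes the tool writes its package as mfi_<board>.tar.gz (worth verifying on your release), and the function name `print_mfi_steps` and the variant-specific archive names are hypothetical.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the MFI generation command from this post
# for one module variant, plus a copy step that gives the archive a
# variant-specific name. Nothing is executed.
print_mfi_steps() {
    variant="$1"   # "300" or "301": the module currently attached to the host
    board="jetson-xavier-nx-devkit-emmc"
    echo "sudo ./tools/kernel_flash/l4t_initrd_flash.sh --no-flash --massflash 6 --external-device nvme0n1 -c ./tools/kernel_flash/flash_l4t_nvme_250.xml -S 225GiB $board nvme0n1p1"
    # Assumption: the generated package is named mfi_<board>.tar.gz.
    echo "cp mfi_$board.tar.gz mfi_xavier_nx_$variant.tar.gz"
}
```

Running `print_mfi_steps 300` with the 300 module attached, then `print_mfi_steps 301` with the 301 module attached, gives the two command sequences to review before running them by hand.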

Then please add this patch if you need to stay on rel-32.6.1.
