Error during backup and restore script execution for Orin NX 16 GB on L4T 35.5 (JetPack 5.1.3)

Yesterday we tried the restore on two different new Orin NX units, and both gave the same error.

After trying nvrestore_partitions.sh from 35.5 (which uses partx instead of partprobe to update the partition table), I get a slightly different error during the restore, as shown below:

Step 3: Start the flashing process

Waiting for target to boot-up…
Waiting for target to boot-up…
Waiting for target to boot-up…
Waiting for device to expose ssh …Waiting for device to expose ssh …Device has booted into initrd. You can ssh to the target by the command:
$ ssh root@fc00:1:1:0::2
Cleaning up…
Log is saved to Linux_for_Tegra/initrdlog/flash_1-6_0_20240710-104814.log
Run command:
ln -s /proc/self/fd /dev/fd && mount -o nolock [fc00:1:1::1]:/home/trident/Downloads/r36_3_0/Linux_for_Tegra/tools/backup_restore /mnt && /mnt/nvrestore_partitions.sh -e nvme0n1 -n
on root@fc00:1:1::2
/mnt/images ~
nvrestore_partitions.sh: Use the default nvpartitionmap.txt as the index file.
40+0 records in
40+0 records out
20480 bytes (20 kB, 20 KiB) copied, 0.000446112 s, 45.9 MB/s
partition: none, disk: /dev/nvme0n1, lower: 0, upper: 0
Trying to use ‘/dev/loop0’ for the loop device
range recount: max partno=0, lower=0, upper=0
nvrestore_partitions.sh Restoring qspi0 with image QSPI0.img…
Erasing 65536 Kibyte @ 0 – 0 % complete
gzip: QSPI0.img: not in gzip format
0+0 records in
0+0 records out
0 bytes copied, 0.00187219 s, 0.0 kB/s
nvrestore_partitions.sh Restoring gpt_2 with image gptbackup.img…
33+0 records in
33+0 records out
16896 bytes (17 kB, 16 KiB) copied, 0.000390016 s, 43.3 MB/s
mke2fs 1.46.5 (30-Dec-2021)
The file /dev/nvme0n1p1 does not exist and no size was specified.
Erasing 65536 Kibyte @ 0 – 100 % complete
Copied 67108864 bytes from QSPI0.img to address 0x00000000 in flash
~
Error flashing non-qspi storage

Can you try pre-formatting the disk as ext4 before restoring?
Anyway, I don’t think the issue is about the commands used for dealing with partition tables.
It’s still about your disk.

OK, we will try that. But how come the earlier two Orin NX modules, which had similar NVMe drives, were flashed/restored successfully without throwing this error?
According to the hardware team, those NVMe disks were not formatted either before being mounted on the carrier board.

Just FYI: I also tried flashing through initrd.sh, and we get the error below, so it looks like an NVMe-related problem.

Waiting for target to boot-up…
Waiting for target to boot-up…
Waiting for device to expose ssh …Waiting for device to expose ssh …Run command: flash on fc00:1:1:0::2
SSH ready
blockdev: cannot open /dev/mmcblk0boot0: No such file or directory
[ 0]: l4t_flash_from_kernel: Serial Number: 1422223012282
[ 0]: l4t_flash_from_kernel: Starting to create gpt for emmc
Active index file is /mnt/internal/flash.idx
Number of lines is 61
max_index=60
[ 1]: l4t_flash_from_kernel: Successfully create gpt for emmc
[ 1]: l4t_flash_from_kernel: Starting to create gpt for external device
Active index file is /mnt/external/flash.idx
Number of lines is 18
max_index=17
writing item=1, 9:0:primary_gpt, 512, 19968, gpt_primary_9_0.bin, 16896, fixed--0, f9cf89a5b629d86c1be5f2330b1cca8ee500d725
Error: Could not stat device /dev/nvme0n1 - No such file or directory.
Flash failure
Either the device cannot mount the NFS server on the host or a flash command has failed. Debug log saved to /tmp/tmp.P5xLOnDbyC. You can access the target’s terminal through “sshpass -p root ssh root@fc00:1:1:0::2”
Cleaning up…

@DaveYYY

Is there any solution to fix this issue?

Also, we are not able to enter the UEFI menu by pressing ESC or F11 during boot.

I don’t think it’s about UEFI at all.
The issue is that your NVMe disk cannot be detected.
You should first get a new disk before proceeding.
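Before retrying the restore, it may help to confirm from the initrd shell (reachable over ssh, as the log above shows) whether the kernel has created the NVMe device node at all. The following is only a minimal sketch: the function name check_nvme is hypothetical, and the default path /dev/nvme0n1 is simply the node named in the failing log.

```shell
# Sketch: report whether a block device node exists on the target.
# check_nvme is a hypothetical helper name, not part of the L4T tools.
check_nvme() {
    dev="${1:-/dev/nvme0n1}"   # default taken from the error in the log
    if [ -b "$dev" ]; then
        echo "present: $dev"
        return 0
    fi
    echo "missing: $dev"
    return 1
}

# Possible usage on the target while it sits in initrd:
#   check_nvme || dmesg | grep -i nvme
```

If the node is missing, the kernel messages about nvme/PCIe are usually the next place to look, since no partition tool can work on a disk that was never enumerated.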

We got it fixed now. Our hardware engineer did some touch-ups or similar rework on the carrier board where the NVMe is mounted, and we have been able to restore/flash 2 more units successfully.

Sorry for the confusion. That was a completely different question, not related to the NVMe restore/flashing error.
I wanted to know how to make ESC/F11 work for entering UEFI while booting.

You can try using the keyboard connected to your host PC when you are monitoring through the serial console.

You mean we need to press the ESC/F11 key on the keyboard connected to the host PC (where the serial console is open) while the Jetson is booting,
instead of on the keyboard connected to the Jetson Orin NX unit?

YES, exactly.
Usually both keyboards should work. I don’t know why you cannot get it with the one on the device.

Will try this on host PC keyboard and let you know.

If I am not wrong, I remember a similar issue being reported in another thread, where the UEFI menu does not come up when ESC/F11 is pressed on a Jetson unit with a custom carrier board.

Note, FYI: I have changed the boot priority order to NVMe first using the command ($ sudo efibootmgr -o 0008,000A…) on the restored units, as these units were trying to boot via HTTP, PXE, etc. after restoration.
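For anyone repeating this on several restored units, the order string passed to efibootmgr -o can be built from the efibootmgr listing instead of being typed by hand. This is a rough sketch only: nvme_first_order is a hypothetical helper, and the default match string "NVMe" is an assumption, since the actual boot entry label depends on the drive and the UEFI build.

```shell
# Sketch: read `efibootmgr` output on stdin and print a comma-separated
# boot order with entries matching a pattern first.
# nvme_first_order is a hypothetical name; "NVMe" is an assumed label.
nvme_first_order() {
    pattern="${1:-NVMe}"
    awk -v pat="$pattern" '
        # Entry lines look like "Boot0008* <label>"; skip BootOrder etc.
        /^Boot[0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f]/ {
            id = substr($0, 5, 4)
            if (index(toupper($0), toupper(pat)) > 0)
                first = first (first == "" ? "" : ",") id
            else
                rest = rest (rest == "" ? "" : ",") id
        }
        END {
            if (first != "" && rest != "") print first "," rest
            else print first rest
        }
    '
}

# Possible usage on a restored unit (check the printed order first):
#   sudo efibootmgr | nvme_first_order
#   sudo efibootmgr -o "$(sudo efibootmgr | nvme_first_order)"
```

Printing the order before applying it is worth the extra step, because a pattern that matches nothing leaves the existing order of entries unchanged rather than moving NVMe to the front.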

Now the ESC key is working fine (earlier we were using a USB hub to connect the keyboard; now we are using the USB 2.0 and USB 3.0 ports from the SOM directly), and we are able to enter the UEFI BIOS screen.

But we have another peculiar problem: once we enter UEFI by pressing the ESC key, the screen starts jumping between different windows randomly on its own, even though we are not pressing any keys on the keyboard.
We see the same haphazard behaviour (moving to different menus, submenus, screens…) even after we remove the keyboard from the USB port.

One more observation:

If we connect the serial console, the BIOS screen is stable and works fine, with no haphazard movement between menus and submenus.

Please let us know how to fix this issue.

@KevinFFF @DaveYYY @WayneWWW

Please see the flicker issue in the attached video.

Without pressing any keys on the keyboard, it is flickering…

Note: If we connect the RX, TX and GND pins of the serial port (console mode), we don’t see this issue.

Kindly let us know what could be the issue.
It is the same issue in version 35.4.1 also.

Do you connect anything other than the keyboard to the device?

We have also connected a mouse, through a USB hub.

I just tested locally and the screen is stable either with or without serial console connected on 36.3.
Can you please try that version?

We are testing on 36.3 only.
We also tried 35.4.1, and the issue is the same there.

Are you testing on your development kit?

Are you using any level shifter ICs for UART debug port signals in your development kit?

YES.

NO.

It seems more like some input from serial/USB is causing the interaction in UEFI.

Please remove all serial/USB devices from your board and check whether the issue is still there.

Since you won’t hit the issue after connecting the RX/TX/GND of debug UART, is there any custom design in this part different from the devkit?

On the dev kit, a transceiver (with pull-ups/pull-downs, etc.) is used on the dev kit board (I am not sure, can you confirm?),
whereas on our custom carrier board we take the UART TX/RX lines directly from the SOM (according to our hardware engineer).

This might be causing the issue, because as soon as we power the external transceiver connected to the UART debug console TX/RX/GND pins (through a USB-serial converter cable connected to a host PC), the issue stops.

The voltage levels might be floating on the TX/RX lines brought out directly from the SOM.
What is your opinion on this?

Thanks.