Jetson Orin Nano Developer Kit - NVMe flash fails in every possible way

Hi,

Is apply-binaries-sh.txt a full log of what is shown in the terminal after you run sudo ./apply_binaries.sh? It should install way more packages than what your log shows. Did you delete the original folder and start from a freshly downloaded and extracted BSP?

Can you show us what is inside your Linux_for_Tegra/rootfs/lib/modules/5.10.104-tegra/kernel/drivers folder?

Thanks for your correspondence.

The apply-binaries-sh.txt was indeed the full log.

I have downloaded both packages again from:
https://developer.nvidia.com/downloads/embedded/l4t/r35_release_v3.1/release/jetson_linux_r35.3.1_aarch64.tbz2/

…with…

md5sum Jetson_Linux_R35.3.1_aarch64.tbz2
1d254bec244e1bdeaa4012f02e2c6057  Jetson_Linux_R35.3.1_aarch64.tbz2

and:

https://developer.nvidia.com/downloads/embedded/l4t/r35_release_v3.1/release/tegra_linux_sample-root-filesystem_r35.3.1_aarch64.tbz2/

…with…

md5sum Tegra_Linux_Sample-Root-Filesystem_R35.3.1_aarch64.tbz2
12bd3ed977eb86f6b13e90a2000022de  Tegra_Linux_Sample-Root-Filesystem_R35.3.1_aarch64.tbz2
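
The checksum step above can be scripted so that a corrupted download is caught before extraction. This is only a minimal sketch: `verify_md5` is a hypothetical helper (not part of the BSP), and the expected digest has to come from a trusted source. The two digests printed above are simply what this download produced.

```shell
#!/usr/bin/env bash
# verify_md5 FILE EXPECTED_MD5
# Hypothetical helper (not part of the BSP): returns 0 when the MD5 digest of
# FILE equals EXPECTED_MD5, 1 on mismatch, 2 when FILE does not exist.
verify_md5() {
    local file="$1" expected="$2" actual
    if [ ! -f "$file" ]; then
        echo "missing: $file" >&2
        return 2
    fi
    # md5sum prints "<digest>  <filename>"; keep only the digest.
    actual="$(md5sum "$file" | awk '{print $1}')"
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got $actual, expected $expected)" >&2
        return 1
    fi
}
```

It could be called as verify_md5 Jetson_Linux_R35.3.1_aarch64.tbz2 1d254bec244e1bdeaa4012f02e2c6057, and likewise for the sample root filesystem archive.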

This time I noticed that on the first run after freshly unpacking the archives, the log differs slightly:

230517_apply-binaries_dif.txt (1.8 KB)

Following your suggestion to ls -al Linux_for_Tegra/rootfs/lib/modules/5.10.104-tegra/kernel/drivers we indeed found something unexpected - namely that:

supergrobi@OS:~/nvidia$ ls -al Linux_for_Tegra/rootfs/lib/modules/5.10.104-tegra/kernel/drivers
ls: cannot access 'Linux_for_Tegra/rootfs/lib/modules/5.10.104-tegra/kernel/drivers': No such file or directory
supergrobi@OS:~/nvidia$ ls Linux_for_Tegra/rootfs/lib/modules
ls: cannot access 'Linux_for_Tegra/rootfs/lib/modules': No such file or directory

Could this be a hint to what is going wrong in here?
Further - permissions are okay like that:

supergrobi@OS:~$ ls -al nvidia/Linux_for_Tegra/rootfs
total 16
drwxr-xr-x 1 root       root        150 May 17 08:51 .
drwxr-xr-x 1 supergrobi supergrobi 4052 Mar 19 15:23 ..
-rw-r--r-- 1 supergrobi supergrobi   62 Mar 19 15:14 README.txt
lrwxrwxrwx 1 root       root          7 Jul 31  2020 bin -> usr/bin
drwxr-xr-x 1 root       root         16 May 17 08:51 boot
drwxr-xr-x 1 root       root         26 May 17 08:51 dev
drwxr-xr-x 1 root       root       3882 Mar  1 03:21 etc
drwxr-xr-x 1 root       root          0 Apr 15  2020 home
lrwxrwxrwx 1 root       root          7 Jul 31  2020 lib -> usr/lib
drwxr-xr-x 1 root       root          0 May 19  2021 media
drwxr-xr-x 1 root       root          0 Jul 31  2020 mnt
drwxr-xr-x 1 root       root         12 May 17 08:51 opt
drwxr-xr-x 1 root       root          0 Apr 15  2020 proc
drwx------ 1 root       root         42 Sep 20  2022 root
drwxr-xr-x 1 root       root        416 Jul  7  2021 run
lrwxrwxrwx 1 root       root          8 Jul 31  2020 sbin -> usr/sbin
drwxr-xr-x 1 root       root          0 Feb  2  2021 snap
drwxr-xr-x 1 root       root          0 Jul 31  2020 srv
drwxr-xr-x 1 root       root          0 Apr 15  2020 sys
drwxrwxrwt 1 root       root          0 Mar  1 03:21 tmp
drwxr-xr-x 1 root       root         84 Mar  4  2021 usr
drwxr-xr-x 1 root       root        122 Mar  4  2021 var

That is really weird…
Absence of the lib/modules means your system is no more than a pure Ubuntu, with no NVIDIA stuff in it.
Are you sure sudo ./apply_binaries.sh ends correctly without any errors? Also see if this post helps.
If downloading and extracting the BSP package does not solve the issue, can you please try to use a different host PC?
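
The directory check requested above can be repeated after every run of apply_binaries.sh. A minimal sketch, assuming the default BSP layout and the kernel version 5.10.104-tegra quoted in this thread; `check_rootfs_modules` is a hypothetical helper, not part of the BSP:

```shell
#!/usr/bin/env bash
# check_rootfs_modules [ROOTFS_DIR] [KERNEL_VERSION]
# Hypothetical helper: succeeds when the kernel driver modules directory
# exists inside the extracted root filesystem. Both defaults are assumptions
# taken from the paths mentioned in this thread.
check_rootfs_modules() {
    local rootfs="${1:-Linux_for_Tegra/rootfs}"
    local kver="${2:-5.10.104-tegra}"
    local drivers="$rootfs/lib/modules/$kver/kernel/drivers"
    # [ -d ] follows symlinks, so "lib -> usr/lib" is handled.
    if [ -d "$drivers" ]; then
        echo "found: $drivers"
    else
        echo "missing: $drivers (apply_binaries.sh may not have completed)" >&2
        return 1
    fi
}
```

Run from the directory that contains Linux_for_Tegra, a plain check_rootfs_modules call is enough; a non-zero exit status means the modules are not there and flashing is unlikely to give a usable system.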

Thanks David - this helped a little - but not fully yet.

sudo ./apply_binaries.sh

resulted in great success:
230517_apply-binaries_suc.txt (12.7 KB)

though the flashing procedure…

sudo ./tools/kernel_flash/l4t_initrd_flash.sh --external-device mmcblk1p1 -c tools/kernel_flash/flash_l4t_external.xml -p "-c bootloader/t186ref/cfg/flash_t234_qspi.xml" --showlogs --network usb0 jetson-orin-nano-devkit internal

…looks different than usual - but still fails:
230517_sudo-flash-sh-jetson-orin-nano_2.txt (246.0 KB)

sudo bash ./flashcmd.txt

230517_bootl-sudo-flashcmd.txt (1.6 KB)

Hi,

sorry for the late reply. Can you attach log dumped from UART when flashing the board?

@m.j.kurz What is the host system? It seems NVMe flashing is just broken out of the box for Ubuntu 20. I faced yet another failure mode flashing to NVMe.

A friend has successfully flashed with Ubuntu 18 but we need to repeat it on another machine to be sure if the OS is truly the issue.

Also, try this as a workaround if everything else fails

sudo ./tools/kernel_flash/l4t_initrd_flash.sh --external-device nvme0n1 -c ./tools/kernel_flash/flash_l4t_nvme.xml -S 200GiB --showlogs --erase-all jetson-orin-nano-devkit nvme0n1p1

Fails in the same manner I have seen in SDKM, citing permissions issues with RTELink. Why is this so broken on Ubuntu 20? It seems Ubuntu 18 works fine:

Hello Dave :),
please find the serial-log attached:
230523_serial.log (30.2 KB)

Hi ss332,
sorry for the late reply from my side and thanks for your correspondence. So regarding the host systems that I have tried:

  • Ubuntu 18
  • Ubuntu 20
  • NixOS (with Ubuntu 18 or 20 in Docker - for the Jetson Nano this setup worked perfectly fine).

Hi,

Looks like the flashing process was interrupted as your device entered the UEFI Shell.
Make sure UEFI Shell is not in the first place in boot order, and unplug all peripherals (keyboard, mouse, USB flash drive, etc.) from your device and flash again.

Hello,

so I have tried the whole process now without any peripherals attached (keyboard, mouse, USB flash drive, Ethernet etc).

230601_ab_flash_jet_or_n.log (256.5 KB)

The flash seems to succeed and seems to provide an open ssh connection.
The IP is not requested from the DHCP server (that I have configured on my Notebook) - but set static from a previous flash (192.168.0.1).

ssh root@192.168.0.1

…does not respond.

Restarting the device and showing the logs over the serial monitor:
230523_serial.log (30.2 KB)
…shows me again that the normal boot is interrupted by UEFI. When I try to manually change the boot order in the UEFI menu (haahaa), I see that the NVMe should be the first to boot, then comes IPv4, then IPv6 and so on.

What seems miraculous in that context is, that even though I flashed the device now multiple times - the boot-loader remains the UEFI with the previously set configurations.

Is there any way to flash the device even harder?

Thanks in advance and have a great day.

I don’t quite get it with this part, and can you please elaborate it more?
You should be able to ssh into 192.168.55.1 if you connect the device to your host PC with a micro USB cable.
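
Before trying ssh over the USB cable, it can help to confirm that the host actually received an address on the USB network. A minimal sketch, assuming the USB link uses the 192.168.55.0/24 subnet (inferred from the 192.168.55.1 address above; the /24 prefix is an assumption); `has_usb_gadget_addr` is a hypothetical helper:

```shell
#!/usr/bin/env bash
# has_usb_gadget_addr
# Hypothetical helper: reads the output of `ip -4 -o addr show` on stdin and
# succeeds if any interface has an IPv4 address starting with 192.168.55.
# The subnet is an assumption based on the device address 192.168.55.1.
has_usb_gadget_addr() {
    grep -Eq 'inet 192\.168\.55\.[0-9]+/'
}
```

A possible use on the host is ip -4 -o addr show | has_usb_gadget_addr && echo "USB network is up". If nothing is printed, the USB network interface did not come up and ssh to 192.168.55.1 cannot work yet.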

So does manually changing boot order work?

Thanks for asking these questions - I got it wrong before and tried to establish a connection through Ethernet-SSH.

Unfortunately it seems, that my Host-PC is not getting the CDC_USB up. One issue I figured out already was, that I didn’t have TPM2 activated on my host. After activating I still get:

usb usb2-port1: config error

…when running dmesg --follow and plugging in the Jetson USB cable.

Regarding the UEFI shell - yes - I could change the boot-order:

The serial log though stays basically the same (not booting into NVMe - but coming up in the end with the UEFI-shell).

Did dmesg give you things in addition to this when the connection failed?
Have you tried with a different USB cable/port?

I’d suggest you use UART or real Ethernet to log into your device if the USB method is temporarily unavailable.

Did dmesg give you things in addition to this when the connection failed?
Have you tried with a different USB cable/port?

dmesg --follow-new gives the same error (and only that):
usb usb2-portX: config error
…on all USB-ports.
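
To compare ports and cables more systematically, the kernel log can be filtered for USB enumeration problems. A minimal sketch: `usb_errors` is a hypothetical helper, and the pattern list is an assumption covering the message seen here plus a few commonly seen kernel USB error messages; it is not exhaustive.

```shell
#!/usr/bin/env bash
# usb_errors
# Hypothetical helper: reads kernel log text on stdin and prints only lines
# that look like USB port or enumeration errors. Exit status is 1 when no
# such line is found (grep's usual behaviour).
usb_errors() {
    grep -E 'usb.*(config error|not accepting address|unable to enumerate|descriptor read)'
}
```

For example, sudo dmesg | usb_errors after each plug-in attempt shows whether a different cable or port changes the message.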

I’d suggest you use UART or real Ethernet to log into your device if the USB method is temporarily unavailable.

UART works fine. The main issue persists: even after flashing the device's NVMe, it does not boot, but just shows the prompt to press ESC, F11 or Enter to either [go into boot-menu], [continue normal boot] or [open UEFI-shell].

The IP is not requested from the DHCP server (that I have configured on my Notebook) - but set static from a previous flash (192.168.0.1).

I don’t quite get it with this part, and can you please elaborate it more?

To elaborate further on that: In one of my previous efforts I have set the IP in the UEFI-boot-menu to 192.168.0.1.
After that, I made a series of attempts to flash the device's NVMe with the methods mentioned in the previous comments. Further, I have set up a DHCP server on my host, so when connecting the physical Ethernet cable, the Jetson should request an IP from the DHCP server.

In practice, this does not happen - the IP remains 192.168.0.1 (it's possible to ping) - and trying to connect with ssh root@192.168.0.1 fails.

Hi, sorry that I just noticed the flashing process was not complete.
It booted into initrd kernel, and rootfs was to be flashed via SSH, which should be established with the USB cable.

Did you get anything after

Waiting for device to expose ssh …RTNETLINK answers: File exists
RTNETLINK answers: File exists
Waiting for device to expose ssh …Run command: flash on fc00:1:1:0::2
SSH ready

, or did you think it had completed and remove the cable?

I indeed thought it had completed - though when trying it again, it basically ends with:

Flash failure
Cleaning up...

…at the end of the log.

20230609_flsh_jet.log (247.2 KB)
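
Since the log is long, the result can be checked mechanically instead of reading the terminal scrollback. A minimal sketch, assuming the flash output was saved to a file and that the "Flash failure" line quoted above is the marker to look for; `flash_failed` is a hypothetical helper, and the absence of the marker does not prove that the flash succeeded.

```shell
#!/usr/bin/env bash
# flash_failed LOGFILE
# Hypothetical helper: returns 0 if LOGFILE contains the "Flash failure"
# marker quoted in this thread, 1 if it does not, 2 if the file is unreadable.
flash_failed() {
    local log="$1"
    if [ ! -r "$log" ]; then
        echo "cannot read: $log" >&2
        return 2
    fi
    grep -q 'Flash failure' "$log"
}
```

For example, flash_failed 20230609_flsh_jet.log && tail -n 40 20230609_flsh_jet.log prints the end of the log only when the failure marker is present.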

Is the command:

sudo ./tools/kernel_flash/l4t_initrd_flash.sh --external-device mmcblk1p1 -c tools/kernel_flash/flash_l4t_external.xml -p "-c bootloader/t186ref/cfg/flash_t234_qspi.xml" --showlogs --network usb0 jetson-orin-nano-devkit internal

…(specifically --network usb0, given my assumption that Ethernet-over-USB does not work) correct?

Thanks a lot for your efforts.

Hi,

sorry for the late discovery, but I just noticed that you said in the title of this post that you want to flash the NVMe SSD, but you used mmcblk1p1 instead of nvme0n1p1 in the flashing command…

Can you help to confirm if it’s the root cause?
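
A mismatch like this can be caught before flashing by looking at the device name passed to --external-device. A small sketch based on the usual Linux block-device naming (nvme* for NVMe, mmcblk* for SD/eMMC, sd* for SCSI-style disks such as USB or SATA); `device_kind` is a hypothetical helper and not part of the flashing tools:

```shell
#!/usr/bin/env bash
# device_kind NAME
# Hypothetical helper: classifies a Linux block-device or partition name by
# its conventional prefix and prints the storage type.
device_kind() {
    case "$1" in
        nvme*)   echo "nvme" ;;
        mmcblk*) echo "sd-or-emmc" ;;
        sd*)     echo "usb-or-sata" ;;
        *)       echo "unknown" ;;
    esac
}
```

Here device_kind mmcblk1p1 prints sd-or-emmc while device_kind nvme0n1p1 prints nvme, which makes it obvious when the command does not target the intended storage.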


Hello hello,

unfortunately I can not confirm that as the root-cause.

sudo ./tools/kernel_flash/l4t_initrd_flash.sh --external-device nvme0n1p1 -c tools/kernel_flash/flash_l4t_external.xml -p "-c bootloader/t186ref/cfg/flash_t234_qspi.xml" --showlogs --network usb0 jetson-orin-nano-devkit internal

…results as well in:
20230612_jetsonlog.log (246.2 KB)

Can we find out, if the NVMe is working together correctly with the DevKIT?

And a side-quest(ion): is there a tutorial available, on how to boot the Jetson from network / or how to flash it with a custom image over eth?

Hi,

what do you get from UART currently?