Flashing Orin from inside Docker container

Hi,

We need to use Docker to create a flashing setup. I’ve used the forum post "Flashing Orin NX using custom docker container" (#4 by benyamin) as an example and got to the point where the target boots. I even managed to flash the NVMe, but that worked only once; the run then failed because ssh timed out while the QSPI was being erased:

Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for device to expose ssh ......Waiting for device to expose ssh ...Run command: flash on fc00:1:1:0::2
SSH ready
blockdev: cannot open /dev/mmcblk0boot0: No such file or directory
[ 0]: l4t_flash_from_kernel: Serial Number: 1421823001379
[ 0]: l4t_flash_from_kernel: Starting to create gpt for emmc
Active index file is /mnt/internal/flash.idx
Number of lines is 61
max_index=60
[ 1]: l4t_flash_from_kernel: Successfully create gpt for emmc
[ 1]: l4t_flash_from_kernel: Starting to create gpt for external device
Active index file is /mnt/external/flash.idx
Number of lines is 22
max_index=21
writing item=1, 9:0:primary_gpt, 512, 19968, gpt_primary_9_0.bin, 16896, fixed-<reserved>-0, 977155b6b1ea5b0415af6d03a5ed3ba520cab76b
Writing primary_gpt partition with gpt_primary_9_0.bin
Offset is not aligned to K Bytes, no optimization is applied
dd if=/mnt/external/gpt_primary_9_0.bin of=/dev/nvme0n1 bs=1 skip=0  seek=512 count=16896
16896+0 records in
16896+0 records out
16896 bytes (17 kB, 16 KiB) copied, 0.032323 s, 523 kB/s
Writing primary_gpt partition done
Error: The backup GPT table is corrupt, but the primary appears OK, so that will be used.
Warning: Not all of the space available to /dev/nvme0n1 appears to be used, you can fix the GPT to use all of the space (an extra 380580528 blocks) or continue with the current setting? 
Writing secondary_gpt partition with gpt_secondary_9_0.bin
Offset is not aligned to K Bytes, no optimization is applied
dd if=/mnt/external/gpt_secondary_9_0.bin of=/dev/nvme0n1 bs=1 skip=0  seek=61203267072 count=16896
16896+0 records in
16896+0 records out
16896 bytes (17 kB, 16 KiB) copied, 0.0283855 s, 595 kB/s
Writing secondary_gpt partition done
Fix/Ignore? Fix                                                           
Warning: Not all of the space available to /dev/nvme0n1 appears to be used, you can fix the GPT to use all of the space (an extra 380580528 blocks) or continue with the current setting? 
Model: TS256GMTE110S (nvme)
Disk /dev/nvme0n1: 256GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 

Number  Start   End     Size    File system  Name                Flags
 2      20.5kB  134MB   134MB                A_kernel            msftdata
 3      134MB   135MB   786kB                A_kernel-dtb        msftdata
 4      135MB   168MB   33.2MB               A_reserved_on_user  msftdata
 5      168MB   302MB   134MB                B_kernel            msftdata
 6      302MB   303MB   786kB                B_kernel-dtb        msftdata
 7      303MB   336MB   33.2MB               B_reserved_on_user  msftdata
 8      336MB   420MB   83.9MB               recovery            msftdata
 9      420MB   421MB   524kB                recovery-dtb        msftdata
10      421MB   488MB   67.1MB               esp                 msftdata
11      488MB   572MB   83.9MB               recovery_alt        msftdata
12      572MB   572MB   524kB                recovery-dtb_alt    msftdata
13      572MB   639MB   67.1MB               esp_alt             msftdata
14      639MB   1059MB  419MB                UDA                 msftdata
15      1059MB  1562MB  503MB                reserved            msftdata
16      1562MB  1687MB  126MB   fat16        resin-boot          boot, esp
17      1687MB  2438MB  751MB   ext4         resin-rootA         msftdata
18      2438MB  3189MB  751MB   ext4         resin-rootB         msftdata
19      3189MB  3210MB  21.0MB  ext4         resin-state         msftdata
20      3210MB  61.2GB  58.0GB               resin-data          msftdata

[ 2]: l4t_flash_from_kernel: Expanding last partition to fill the storage device
[ 2]: l4t_flash_from_kernel: Successfully create gpt for external device
[ 2]: l4t_flash_from_kernel: Starting to flash to emmc
Flash index file is /mnt/internal/flash.idx
Active index file is /mnt/internal/flash.idx
Number of lines is 61
max_index=60
Number of lines is 61
max_index=60
[ 2]: l4t_flash_from_kernel: Starting to flash to external device
Active index file is /mnt/external/flash.idx
Number of lines is 22
max_index=21
[ 2]: l4t_flash_from_kernel: Starting to flash to qspi
QSPI storage size: 67108864 bytes.
writing item=0, 9:0:master_boot_record, 0, 512, mbr_9_0.bin, 512, fixed-<reserved>-0, 694898d1c345bdb31b377790ed7fc0b0db184bf7
writing item=1, 9:0:primary_gpt, 512, 19968, gpt_primary_9_0.bin, 16896, fixed-<reserved>-0, 977155b6b1ea5b0415af6d03a5ed3ba520cab76b
writing item=2, 9:0:A_kernel, 20480, 134217728, , , fixed-<reserved>-2, 
[ 2]: l4t_flash_from_kernel: Warning: skip writing A_kernel partition as no image is specified
writing item=3, 9:0:A_kernel-dtb, 134238208, 786432, , , fixed-<reserved>-3, 
[ 3]: l4t_flash_from_kernel: Warning: skip writing A_kernel-dtb partition as no image is specified
writing item=4, 9:0:A_reserved_on_user, 135024640, 33161216, , , fixed-<reserved>-4, 
[ 3]: l4t_flash_from_kernel: Warning: skip writing A_reserved_on_user partition as no image is specified
writing item=5, 9:0:B_kernel, 168185856, 134217728, , , fixed-<reserved>-5, 
[ 3]: l4t_flash_from_kernel: Warning: skip writing B_kernel partition as no image is specified
writing item=6, 9:0:B_kernel-dtb, 302403584, 786432, , , fixed-<reserved>-6, 
[ 3]: l4t_flash_from_kernel: Warning: skip writing B_kernel-dtb partition as no image is specified
writing item=7, 9:0:B_reserved_on_user, 303190016, 33161216, , , fixed-<reserved>-7, 
[ 3]: l4t_flash_from_kernel: Warning: skip writing B_reserved_on_user partition as no image is specified
writing item=8, 9:0:recovery, 336351232, 83886080, , , fixed-<reserved>-8, 
[ 3]: l4t_flash_from_kernel: Warning: skip writing recovery partition as no image is specified
writing item=9, 9:0:recovery-dtb, 420237312, 524288, , , fixed-<reserved>-9, 
[ 3]: l4t_flash_from_kernel: Warning: skip writing recovery-dtb partition as no image is specified
writing item=10, 9:0:esp, 420761600, 67108864, uefi_jetson.bin, 3342336, fixed-<reserved>-10, 87462abb434db49791f26e029929eb9e170cd539
Writing esp partition with uefi_jetson.bin
Get size of partition through connection.
3342336 bytes from /mnt/external/uefi_jetson.bin to /dev/nvme0n1: 1KB block=3264 remainder=0
dd if=/mnt/external/uefi_jetson.bin of=/dev/nvme0n1 bs=1K skip=0  seek=410900 count=3264
3264+0 records in
3264+0 records out
3342336 bytes (3.3 MB, 3.2 MiB) copied, 0.0443841 s, 75.3 MB/s
Writing esp partition done
writing item=11, 9:0:recovery_alt, 487870464, 83886080, , , fixed-<reserved>-11, 
[ 3]: l4t_flash_from_kernel: Warning: skip writing recovery_alt partition as no image is specified
writing item=12, 9:0:recovery-dtb_alt, 571756544, 524288, , , fixed-<reserved>-12, 
[ 3]: l4t_flash_from_kernel: Warning: skip writing recovery-dtb_alt partition as no image is specified
writing item=13, 9:0:esp_alt, 572280832, 67108864, , , fixed-<reserved>-13, 
[ 3]: l4t_flash_from_kernel: Warning: skip writing esp_alt partition as no image is specified
writing item=14, 9:0:UDA, 639401984, 419430400, , , fixed-<reserved>-14, 
[ 3]: l4t_flash_from_kernel: Skip writing UDA partition
writing item=15, 9:0:reserved, 1058832384, 502792192, , , fixed-<reserved>-15, 
[ 4]: l4t_flash_from_kernel: Warning: skip writing reserved partition as no image is specified
writing item=16, 9:0:resin-boot, 1561624576, 125829120, , , fixed-<reserved>-16, 
[ 4]: l4t_flash_from_kernel: Warning: skip writing resin-boot partition as no image is specified
writing item=17, 9:0:resin-rootA, 1687453696, 750780416, , , fixed-<reserved>-17, 
[ 4]: l4t_flash_from_kernel: Warning: skip writing resin-rootA partition as no image is specified
writing item=18, 9:0:resin-rootB, 2438234112, 750780416, , , fixed-<reserved>-18, 
[ 4]: l4t_flash_from_kernel: Successfully flash the emmc
[ 4]: l4t_flash_from_kernel: Warning: skip writing resin-rootB partition as no image is specified
writing item=19, 9:0:resin-state, 3189014528, 20971520, , , fixed-<reserved>-19, 
[ 4]: l4t_flash_from_kernel: Warning: skip writing resin-state partition as no image is specified
writing item=20, 9:0:resin-data, 3209986048, 57993277440, , , expand-<reserved>-20, 
[ 4]: l4t_flash_from_kernel: Warning: skip writing resin-data partition as no image is specified
writing item=21, 9:0:secondary_gpt, 61203267072, 16896, gpt_secondary_9_0.bin, 16896, fixed-<reserved>-0, 9849235129cc423ba016bdf1bcbfe3244ff27638
[ 4]: l4t_flash_from_kernel: Successfully flash the external device


Flash failure
Either the device cannot mount the NFS server on the host or a flash command has failed. Check your network setting (VPN, firewall,...) to make sure the device can mount NFS server. Debug log saved to /tmp/tmp.FBLkoKsMFK. You can access the target's terminal through "sshpass -p root ssh root@fc00:1:1:0::2" 
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to connect to bus: Host is down
Cleaning up...

Unfortunately, most of the time the container hangs like this:

/tmp/bsp-mount/Linux_for_Tegra
***************************************
*                                     *
*  Step 3: Start the flashing process *
*                                     *
***************************************
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for device to expose ssh ......Waiting for device to expose ssh ...Run command: flash on fc00:1:1:0::2
SSH ready

and on the device I see:

bash-5.1# [  234.467261] nfs: server fc00:1:1:0::1 not responding, timed out
[  234.467284] NFS: state manager: lease expired failed on NFSv4 server fc00:1:1:0::1 with error 110
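
The device log above shows an NFSv4 mount toward fc00:1:1:0::1 (the host side of the link) timing out. NFSv4 is served over TCP port 2049, so one thing that could be checked on the host while the flash is stalled is whether anything is listening on that port. A minimal sketch, assuming `ss` from iproute2 is available; the function name `nfs_port_listening` is hypothetical and this only narrows the problem down, it does not fix it:

```shell
# Hypothetical helper: succeed if the `ss -ltn` output on stdin
# contains a listening TCP socket on port 2049 (NFS).
nfs_port_listening() {
    awk '$1 == "LISTEN" && $4 ~ /:2049$/ { found = 1 } END { exit !found }'
}

# Usage (run on the host, or in a container started with --network host):
#   ss -ltn | nfs_port_listening && echo "NFS port is open" || echo "NFS port is closed"
```

If the port is closed while the container run is stalled but open during a successful host run, that would point at how the NFS server is started rather than at the USB link itself.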

If I run the exact same flash command but from the hostOS from /tmp/bsp-mount/Linux_for_Tegra, which is the bind-mount of the container and has the exact same sources, flashing always works as expected and completes successfully.

In the container I’ve installed openssh-server as per the forum post above, and I am starting the container with:

docker container run --rm -it --privileged -v /dev/:/dev/ -v ~/images:/data/images -v /tmp/bsp-mount/Linux_for_Tegra:/tmp/bsp-mount/Linux_for_Tegra -v /run/nvidia_initrd_flash/docker_host_network:/run/nvidia_initrd_flash/docker_host_network  -v /dev/bus/usb:/dev/bus/usb/ --network host -p 2222:22 orin-image /bin/bash

I’ve tried running sshpass -p root ssh root@fc00:1:1:0::2 after the failure and am getting:

ssh: connect to host fc00:1:1::2 port 22: Network is unreachable

from both the host OS and the Docker container. Both the container image and the host OS are Ubuntu 22.04.
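
Note that fc00:1:1:0::2 and fc00:1:1::2 are the same IPv6 address; ssh simply prints it in compressed form. "Network is unreachable" usually means the host has no route to that prefix, for example because the USB network interface has gone away or lost its address. A small sketch for checking which interface, if any, currently holds the host-side address fc00:1:1:0::1; the function name `flash_iface` is hypothetical, and the address is matched in the compressed form that `ip` prints:

```shell
# Hypothetical helper: read `ip -6 -o addr show` output on stdin and
# print the name of the interface that holds fc00:1:1::1 (host side).
flash_iface() {
    awk '$3 == "inet6" && $4 ~ /^fc00:1:1::1\// { print $2 }'
}

# Usage:
#   ip -6 -o addr show | flash_iface
# Empty output means no interface has the address at the moment.
```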

Thanks

*** Please note that this reply is generated by LLM automatically ***

Based on the provided context, it appears that you are experiencing issues with flashing a Jetson Orin device using a custom Docker container. The container hangs during the flashing process, and you are unable to complete the flashing successfully.

To troubleshoot this issue, let’s compare the differences between running the flash command from the host OS and the Docker container.

Host OS:

  • You have installed the necessary packages and dependencies on the host OS.
  • You are running the flash command from the host OS, which has the exact same sources as the Docker container.
  • Flashing always works as expected and completes successfully from the host OS.

Docker Container:

  • You have installed the necessary packages and dependencies in the Docker container, as per the forum post.
  • You are starting the container with the --privileged flag, which allows the container to access the host’s hardware.
  • However, the container hangs during the flashing process, and you are unable to complete the flashing successfully.

Based on the differences between the host OS and the Docker container, here are a few potential causes for the issue:

  1. Privilege issues: Although you are running the container with the --privileged flag, it’s possible that the container is still lacking some privileges or permissions to access the host’s hardware.
  2. Dependency issues: Although you have installed the necessary packages and dependencies in the container, it’s possible that there are some dependencies that are not installed or not compatible with the container’s environment.
  3. Resource constraints: Docker containers can be started with resource limits (e.g., CPU, memory, and I/O). If any limits are configured, it’s possible that the container is running out of resources during the flashing process, causing it to hang.

To troubleshoot this issue further, you can try the following:

  1. Check the container’s logs: Run the command docker logs -f <container_id> to check the container’s logs and see if there are any error messages or warnings that can help you identify the issue.
  2. Check the host’s logs: Run the command dmesg to check the host’s kernel logs and see if there are any error messages or warnings related to the flashing process.
  3. Increase the container’s resources: Try increasing the container’s resources (e.g., CPU, memory, and I/O) to see if it resolves the issue.
  4. Use a different Docker image: Try using a different Docker image that is specifically designed for flashing Jetson devices to see if it resolves the issue.

By comparing the differences between the host OS and the Docker container, and by troubleshooting the issue further, you should be able to identify the root cause of the problem and find a solution to resolve it.

Is there a Docker container & associated command which could be used for this purpose?

Hi @AlexCo,

I hope you are doing well!

I tried replicating the behavior on our end to give you a hand, and I was able to reach the same state you are in. I am using an Orin NX + Devkit with a host PC running Ubuntu 24.04.

I followed the procedure from the forum post you shared. These are the commands I ran:

On host PC:

  1. Added to /etc/exports:
/path/to/Linux_for_Tegra/rootfs *(rw,nohide,insecure,no_subtree_check,async,no_root_squash)
/path/to/Linux_for_Tegra/tools/kernel_flash/images *(rw,nohide,insecure,no_subtree_check,async,no_root_squash)
  2. Change permissions:
sudo chmod 755 /path/to/Linux_for_Tegra/rootfs
sudo chown root:root /path/to/Linux_for_Tegra/rootfs

sudo chmod 755 /path/to/Linux_for_Tegra/tools/kernel_flash/images
sudo chown root:root /path/to/Linux_for_Tegra/tools/kernel_flash/images
  3. Exportfs and restart services:
sudo exportfs -r
sudo systemctl restart nfs-kernel-server
sudo systemctl restart rpcbind
  4. Run docker:
docker run -it --rm --privileged \
    --network host \
    -v /dev/bus/usb:/dev/bus/usb \
    -v /dev:/dev \
    -v /run/nvidia_initrd_flash/docker_host_network:/run/nvidia_initrd_flash/docker_host_network \
    -v /path/to/Linux_for_Tegra:/path/to/Linux_for_Tegra:slave \
    -p 2222:22 \
    flash_jetpack_docker bash
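
The two /etc/exports entries from the host steps can be generated from a single path, which helps keep them in sync with the directory that is bind-mounted into the container. A minimal sketch that reuses the export options listed above; the function name `l4t_export_lines` is hypothetical:

```shell
# Hypothetical helper: print the two /etc/exports lines for a given
# Linux_for_Tegra directory, using the options shown in this post.
l4t_export_lines() {
    l4t_opts='*(rw,nohide,insecure,no_subtree_check,async,no_root_squash)'
    printf '%s %s\n' "$1/rootfs" "$l4t_opts"
    printf '%s %s\n' "$1/tools/kernel_flash/images" "$l4t_opts"
}

# Usage (on the host, followed by `sudo exportfs -r`):
#   l4t_export_lines /path/to/Linux_for_Tegra | sudo tee -a /etc/exports
```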

Inside docker:

  1. Flash the board.

It gets stuck on:

Waiting for device to expose ssh ......Waiting for device to expose ssh ...Run command: flash on fc00:1:1:0::2
SSH ready

And then I get this error log:

22:00:36.693 - Error: Flash failure.
22:00:36.696 - Error: Either the device cannot mount the NFS server on the host or a flash command has failed. Check your network setting (VPN, firewall,...) to make sure the device can mount NFS server. Debug log saved to /tmp/tmp.sJWg7k4kSR. You can access the target's terminal through "sshpass -p root ssh root@fc00:1:1:0::2" 
22:00:36.698 - Debug: The last 5 lines of the debug log are:
SSH ready
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to connect to bus: Host is down
Cleaning up...

And when trying to access:

sshpass -p root ssh root@fc00:1:1:0::2
ssh: connect to host fc00:1:1::2 port 22: Network is unreachable

I will try a few things and let you know if I get it to work!

Best regards,
Nico
Embedded Software Engineer at ProventusNova

Hi,

We do support flashing with a Docker container via SDKmanager:

It doesn’t support external storage, but there is a successful case from the community for your reference:

Thanks.

Hi @AastaLLL and thank you for your response.

I’ve checked the linked post and it seems that was from a Virtual Machine and not from a Docker container. I’ve tried the changes shared in that post but unfortunately am getting the same result:

/tmp/bsp-mount/Linux_for_Tegra
***************************************
*                                     *
*  Step 3: Start the flashing process *
*                                     *
***************************************
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for target to boot-up...
Waiting for device to expose ssh ......Waiting for device to expose ssh ...Run command: flash on fc00:1:1:0::2
SSH ready <-- stall here
Flash failure
Either the device cannot mount the NFS server on the host or a flash command has failed. Check your network setting (VPN, firewall,...) to make sure the device can mount NFS server. Debug log saved to /tmp/tmp.2Sd7zEuyp8. You can access the target's terminal through "sshpass -p root ssh root@fc00:1:1:0::2" 
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to connect to bus: Host is down
Cleaning up...
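
Because the container earlier in this thread is started with --rm and /tmp itself is not bind-mounted, the debug log named in the failure message is lost as soon as the container exits. A minimal sketch for pulling that path out of captured flash output so the log can be copied to a bind-mounted directory first; the function name `debug_log_path` and the file name `flash.out` are hypothetical:

```shell
# Hypothetical helper: read captured flash output on stdin and print
# the path that follows "Debug log saved to".
debug_log_path() {
    sed -n 's/.*Debug log saved to \([^ ]*\)\. .*/\1/p'
}

# Usage inside the container, before exiting (assumes the flash output
# was saved to flash.out and /data/images is bind-mounted from the host):
#   cp "$(debug_log_path < flash.out)" /data/images/
```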

Now, just to be sure I understood correctly, can you please confirm the below points:

  • SDK manager can be used inside Docker, but NVME flashing is not supported with this method, as mentioned in Device booting in initrd cannot mount to folders on NFS server - #9 by AastaLLL. This is because flashing external devices from inside Docker is not stable and may exhibit various failures.
  • Flashing from command line using l4t_initrd_flash.sh is not supported from inside a Docker container. It is only supported from a host computer running Ubuntu.

If both statements are true, then I suspect the issue I encounter is the same as point 1 - flashing of external storage (NVME) is not supported from inside a container, regardless of the tool used - SDK Manager or l4t_initrd_flash.sh

Thanks

Hi,

"SDK manager can be used inside Docker"

No, but we have the SDKmanager container so you can run it on a non-supported device.

Do you want to flash the device inside the container?
If so, could you share your use case so we can check if a suitable tool exists for you?

Thanks.

Indeed @AastaLLL , our use-case is to flash the device using a container. Our test system uses containers for provisioning, and we would like to start a Docker container from another container (docker in docker) to flash Orin devices. We can use an X86 PC as a host of course to run the Docker container.

We’d also like to modify some of the BSP files before starting the flashing process (e.g. modify partitions inside the XML, cvb_eeprom_read_size, etc.), so we don’t want SDK manager to download the sources. We only need a CLI to trigger the flashing of the QSPI and of the NVME plugged into the Jetson Orin carrier board.

Thank you

Hi,

If you don’t want to use the SDKmanager, please check the document below for manual flashing:

But we don’t support manual flashing inside a container, so it’s not guaranteed to work.

Thanks.

Hi @AastaLLL , so to clarify, could you please give me a “yes” or “no” for each of the questions below?

  • SDK Manager can be used to flash a Jetson Orin (QSPI + NVME on carrier board) from a Docker container?

  • SDK manager in container can use modified BSP source instead of downloading reference BSP & Ubuntu?

  • Does SDK manager also provide a CLI interface (instead of GUI) for triggering flashing from inside container using an already unpacked and modified Driver Package (BSP) and Sample Root Filesystem?

Thank you

Hi,

Using SDK Manager inside the container is not officially supported, so it might not work.
Even for the CLI tool, we don’t support running it within a container.

Thanks.