Trouble flashing Orin AGX 32GB for jetpack 6.0 using system docker. Failed to find 'dtc'

Hi All,

I’m looking for help with installing Jetpack 6.0 on my Orin AGX DevKit.

I’m running Ubuntu 22.04 on my host and using the docker image for sdkmanager.

I’ve installed the qemu-user-static package and run the updates recommended on the download page.

I’ve also disabled the usb autosuspend feature, just to be sure it’s not an issue.
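For reference, here’s a sketch of how the autosuspend setting can be checked and disabled (the sysfs path is standard; making the change persist across reboots is distro-specific, e.g. via kernel cmdline or modprobe options):

```shell
# Read the current setting; -1 means autosuspend is disabled.
AUTOSUSPEND_FILE=/sys/module/usbcore/parameters/autosuspend
if [ -r "$AUTOSUSPEND_FILE" ]; then
    cat "$AUTOSUSPEND_FILE"
else
    echo "usbcore parameter not available on this machine"
fi

# Disable until the next reboot (needs root):
#   echo -1 | sudo tee /sys/module/usbcore/parameters/autosuspend
```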

The issue, as far as I can tell, is that the system copies some data to a temporary location but does not include the ‘dtc’ binary for some reason, even though it appears to be in the correct place according to other similar topics.

Below are a few things that may be useful for identifying the cause of the issue:

  • command for starting the sdkmanager
  • snippet from the log file
  • docker image details
  • confirmation of dtc locations within the L4T image (via docker exec while sdkmanager is running, apparently ready to flash and waiting for user confirmation to continue)

Any advice would be appreciated!
Thanks for your time,

Will.

Command for installing and flashing:
I actually started with sdkmanager --cli, but after the wizard config it prints the full command below, which may be useful:

sdkmanager --cli --action install \
    --login-type devzone --product Jetson --target-os Linux --version 6.0 \
    --show-all-versions --host --target JETSON_AGX_ORIN_TARGETS \
    --additional-sdk 'DeepStream 7.0' --select HOST --select 'Jetson Linux' \
    --select 'Jetson Linux image' --select 'Flash Jetson Linux' --select 'Jetson Runtime Components' \
    --select 'Additional Setups' --select 'CUDA Runtime' --select 'CUDA X-AI Runtime' \
    --select 'Computer Vision Runtime' --select 'NVIDIA Container Runtime' --select Multimedia \
    --select 'Jetson SDK Components' --select CUDA --select 'CUDA-X AI' \
    --select 'Computer Vision' --select 'Developer Tools' --select DeepStream \
    --select 'Jetson Platform Services' --flash \
    --license accept

Snippet from log where things seem to go wrong:
NB: I’ve made one very minor modification to the Python file to print the CWD and the command that is about to be executed, for debugging purposes.

11:44:00.032 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS: [   0.2207 ] dtc -I dts -O dtb -o tegra234-mb1-bct-device-p3701-0000_cpp.dtb tegra234-mb1-bct-device-p3701-0000_cpp.dts
11:44:00.032 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS: CWD:  /home/nvidia/nvidia/nvidia_sdk/JetPack_6.0_Linux_JETSON_AGX_ORIN_TARGETS/Linux_for_Tegra/bootloader/7629
11:44:00.032 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS: 
11:44:00.033 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS:     tegraflash_run_commands()
11:44:00.033 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS:   File "/home/nvidia/nvidia/nvidia_sdk/JetPack_6.0_Linux_JETSON_AGX_ORIN_TARGETS/Linux_for_Tegra/bootloader/tegraflash.py", line 1276, in tegraflash_run_commands
11:44:00.033 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS: 
11:44:00.033 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS:     interpreter.onecmd(command)
11:44:00.033 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS:   File "/usr/lib/python3.10/cmd.py", line 217, in onecmd
11:44:00.033 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS: 
11:44:00.034 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS:     return func(arg)
11:44:00.034 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS:   File "/home/nvidia/nvidia/nvidia_sdk/JetPack_6.0_Linux_JETSON_AGX_ORIN_TARGETS/Linux_for_Tegra/bootloader/tegraflash.py", line 893, in do_dump
11:44:00.034 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS: 
11:44:00.034 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS:     self.chip_inst.tegraflash_dump(exports, args)
11:44:00.034 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS:   File "/home/nvidia/nvidia/nvidia_sdk/JetPack_6.0_Linux_JETSON_AGX_ORIN_TARGETS/Linux_for_Tegra/bootloader/tegraflash_impl_t234.py", line 2715, in tegraflash_dump
11:44:00.035 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS: 
11:44:00.053 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS:     self.tegraflash_preprocess_configs()
11:44:00.053 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS:   File "/home/nvidia/nvidia/nvidia_sdk/JetPack_6.0_Linux_JETSON_AGX_ORIN_TARGETS/Linux_for_Tegra/bootloader/tegraflash_impl_t234.py", line 362, in tegraflash_preprocess_configs
11:44:00.053 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS: 
11:44:00.053 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS:     values[config] = self.run_dtc_tool(
11:44:00.053 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS:   File "/home/nvidia/nvidia/nvidia_sdk/JetPack_6.0_Linux_JETSON_AGX_ORIN_TARGETS/Linux_for_Tegra/bootloader/tegraflash_impl_t234.py", line 3657, in run_dtc_tool
11:44:00.053 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS: 
11:44:00.054 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS:     run_command(command, True)
11:44:00.054 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS:   File "/home/nvidia/nvidia/nvidia_sdk/JetPack_6.0_Linux_JETSON_AGX_ORIN_TARGETS/Linux_for_Tegra/bootloader/tegraflash_internal.py", line 355, in run_command
11:44:00.054 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS: 
11:44:00.088 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS:     process = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, shell=use_shell, env=cmd_environ)
11:44:00.089 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS:   File "/usr/lib/python3.10/subprocess.py", line 971, in __init__
11:44:00.090 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS: 
11:44:00.091 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS:     self._execute_child(args, executable, preexec_fn, close_fds,
11:44:00.092 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS:   File "/usr/lib/python3.10/subprocess.py", line 1863, in _execute_child
11:44:00.092 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS: 
11:44:00.093 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS:     raise child_exception_type(errno_num, err_msg, err_filename)
11:44:00.094 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS: FileNotFoundError: [Errno 2] No such file or directory: 'dtc'
11:44:00.094 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS: 
11:44:00.119 - error: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS: --- Error: Reading board information failed.
11:44:00.119 - info: Event: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS - error is: --- Error: Reading board information failed.
11:44:00.120 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS: 
11:44:00.123 - error: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS: [exec_command]: /bin/bash -c /tmp/tmp_NV_L4T_FLASH_JETSON_LINUX_COMP..sh; [error]: --- Error: Reading board information failed.
11:44:00.124 - info: Event: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS - error is: [exec_command]: /bin/bash -c /tmp/tmp_NV_L4T_FLASH_JETSON_LINUX_COMP..sh; [error]: --- Error: Reading board information failed.
11:44:00.124 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS: 
11:44:00.124 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS: 
11:44:00.124 - info: NV_L4T_FLASH_JETSON_LINUX_COMP@JETSON_AGX_ORIN_TARGETS: [ Component Install Finished with Error ]

docker image:

sdkmanager           2.1.0.11682-Ubuntu_22.04   45897d7a36d1   2 weeks ago     970MB

DTC locations:

nvidia@642862197fee:~/nvidia/nvidia_sdk/JetPack_6.0_Linux_JETSON_AGX_ORIN_TARGETS$ sudo find . -iname dtc
./Linux_for_Tegra/kernel/dtc
./Linux_for_Tegra/rootfs/usr/bin/dtc
./Linux_for_Tegra/rootfs/usr/src/linux-headers-5.15.136-tegra-ubuntu22.04_aarch64/3rdparty/canonical/linux-jammy/kernel-source/scripts/dtc
./Linux_for_Tegra/rootfs/usr/src/linux-headers-5.15.136-tegra-ubuntu22.04_aarch64/3rdparty/canonical/linux-jammy/kernel-source/scripts/dtc/dtc
./Linux_for_Tegra/rootfs/usr/src/linux-headers-5.15.136-tegra-ubuntu22.04_aarch64/3rdparty/canonical/linux-jammy/kernel-source/include/config/DTC

Please ignore my previous comment if you have received the notification.
Please check whether dtc is directly available in your docker environment.

Looks like the copy in Linux_for_Tegra/kernel/dtc is only used when generating recovery images for OTA updates.
The three other copies you list are ARM64 binaries, which are of no use on your x86 host PC.
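A quick way to check is to run the following inside the running container (a sketch; the container ID is whatever `docker ps` shows for your sdkmanager container):

```shell
# Enter the container first, e.g.: docker exec -it <container-id> bash
# Then check whether dtc is on PATH:
if command -v dtc >/dev/null 2>&1; then
    OUT="dtc found at: $(command -v dtc)"
else
    OUT="dtc not found on PATH"
fi
echo "$OUT"
```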

Hi @DaveYYY

Thanks for your quick reply!
I do not seem to have dtc in my docker env.

Is it installed dynamically, or is it missing from the Dockerfile?

I’ve tried adding dtc to my docker env with:

sudo apt-get install -y device-tree-compiler

and then tested to see if anything else is missing.

I tried to continue with the sdkmanager, but there’s at least one more issue: it’s missing uuidgen, perhaps from uuid-runtime?

I’ll keep plugging away, adding packages as needed, but I’m guessing either something has gone wrong with my docker environment or the Dockerfile needs updating. How can I conclusively answer that and help resolve the root issue?

For reference: I downloaded this image linked from SDK Manager | NVIDIA Developer

Will.

I’ve found two more packages that are needed, so the full list is now:

sudo apt-get install -y \
  device-tree-compiler \
  whois \
  libxml2-utils

(I guess whois is probably not strictly required, but a ‘please run’ message in the log asks for it.)

Looks like that has finally got it flashing (~22% so far). I really hope it runs through to the end now! (Though I expect the recovery mode I’m using will still work if not.)

In case there are any more gotchas: if there’s a faster way to test for success, that would be appreciated!

Will.

We have a script for installing required packages during flashing:

sudo ./tools/l4t_flash_prerequisites.sh
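For anyone else landing here: that script ships inside the extracted BSP, so it’s run from the Linux_for_Tegra directory. A sketch (the path below is taken from the logs earlier in this thread; adjust for your install location):

```shell
# Assumed install path from the OP's logs; adjust to match yours.
L4T_DIR="$HOME/nvidia/nvidia_sdk/JetPack_6.0_Linux_JETSON_AGX_ORIN_TARGETS/Linux_for_Tegra"
if [ -d "$L4T_DIR" ]; then
    cd "$L4T_DIR" && sudo ./tools/l4t_flash_prerequisites.sh
else
    echo "Linux_for_Tegra not found at: $L4T_DIR"
fi
```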

(I had to deviate on to other work, but I’m back for a bit now)

So, I’m almost there - but the device seems to be unusable as is.

The flash process almost works, then fails. Many steps succeed, but somewhere around 94–96% overall it fails, over and over: one step gets roughly 5–20% of the way through and then freezes with no obvious reason or explanation. The progress indicator keeps ticking up even after the log file stops updating, so perhaps something else is going on, but it eventually seems to time out and report failure.

I’m going to restart everything from scratch and see if it’s some kind of cached issue, but is there a way to recover a Jetson Orin AGX from SD card?

Alternatively, how do I install things manually rather than using the sdkmanager?
I’ve noticed there are detailed instructions for installing Linux for Jetson, but I don’t know how to obtain Linux for Jetson outside of SDK Manager. Is the best approach to use SDK Manager only to download Linux for Jetson, then stop there and switch to the scripts inside it?
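In case it helps others: once SDK Manager has downloaded and extracted the BSP, flashing can be driven manually with the flash.sh script in Linux_for_Tegra, per the L4T Quick Start guide. A sketch; the board config name here is my assumption for the AGX Orin devkit, so verify it against the flashing docs for your exact board, and put the device into recovery mode first:

```shell
# Run from the Linux_for_Tegra directory with the device in recovery mode.
BOARD_CONFIG="jetson-agx-orin-devkit"   # assumed config name; verify in the docs
ROOTDEV="internal"                      # flash to the internal storage
echo "Would run: sudo ./flash.sh $BOARD_CONFIG $ROOTDEV"
# Actual command (needs root, device connected over USB):
#   sudo ./flash.sh "$BOARD_CONFIG" "$ROOTDEV"
```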

Thanks!

Thanks for flagging that prerequisite script. I’ve removed a couple of packages to make it more appropriate for docker, so the list of packages I’m manually installing is:

package_list=(abootimg binutils cpio cpp device-tree-compiler dosfstools
              lbzip2 libxml2-utils openssl python3-yaml
              sshpass uuid-runtime whois rsync zstd)
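The list above can then be expanded into a single apt-get call via bash array expansion; a sketch (echoed here rather than executed, since the real install needs root):

```shell
# Mirrors the package list above.
package_list=(abootimg binutils cpio cpp device-tree-compiler dosfstools
              lbzip2 libxml2-utils openssl python3-yaml
              sshpass uuid-runtime whois rsync zstd)

# Actual install (needs root):
#   sudo apt-get install -y "${package_list[@]}"
printf '%s\n' "${package_list[@]}" | wc -l    # 15 packages in the list
```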

I’ve ensured the following are installed on the host:
nfs-kernel-server udev qemu-user-static binfmt-support

(Edited: below)
On reflection, I’ve installed these packages inside the docker container as well, just to be sure no user tools are missing.

sdkm-2024-07-09-12-51-37.log (3.8 MB)
The process has just completed and failed again.

I’m attaching the log in case there is anything that may explain what’s causing the latest issue.

Please attach the log that SDKM outputs as a zip file.

OK, I’ve attached the same log and the rest of the logs directory as a .zip.

2024-07-09_logs.zip (304.5 KB)

Thanks!

13:48:13.356 - Info:  
13:48:13.356 - Info:  
13:48:13.356 - Info:  
13:48:13.356 - Info:  
13:48:13.356 - Info: ]
13:48:13.356 - Info:  
13:48:13.356 - Info: 0
13:48:13.356 - Info: 0
13:48:13.356 - Info: 5
13:48:13.356 - Info: %

Your log contains a bunch of empty lines, which makes it hard to tell where it stops.
Can you please try again?

I can, but that is just how the logs always look. It’s frustrating isn’t it!

I’ve used grep -v 'Info: *$' <file> | less to explore the files, but it’s still not exactly pleasant! I suspect the way output from a popen is logged was modified to keep timestamps accurate (rather than waiting for a terminating newline); it could be improved by waiting for either a newline or a very short timeout and reporting the time the first byte arrived.

Still, it is what it is and in my experience repeating will not change anything, but if you want to see this for yourself I can repeat it, just let me know!

Working with what I have the failing process starts with:

[ 471.3286 ] Writing partition APP with system.img [ 6662161452 bytes ]

(I’ve semi-tidied the output with the following awk script, but it only fixes half of the issue: only the tegraflash process outputs ^M, so it condenses the rest of the output more than it should.)

awk 'BEGIN{accumulator=""} {full=$0; first=index(full,"Info:")+6; cr=index(full,"^M"); while(cr > 0){ print(accumulator substr(full,first,cr-first)); accumulator=""; first=1; full=substr(full,cr+2); cr=index(full,"^M") } if(length(full) >= first) accumulator = accumulator substr(full,first) } END{print accumulator}' NV_L4T_FLASH_JETSON_LINUX_COMP.log | less

Before beginning, I confirmed the usbcore autosuspend setting is at -1 with

$ cat /sys/module/usbcore/parameters/autosuspend 
-1

and the device had been unplugged and plugged back in, but I couldn’t find the docs telling me to do this again, so I may have missed a step?

Also, I notice the log output contains a number of ‘WARNING’ lines about DTBs:
[ 9.1077 ] Parsing config file: tegra234-mb1-bct-padvoltage-p3701-0000-a04_cpp.dtb ... [ 9.1235 ] WARNING: unknown node 'g9'

could that be a clue to the cause?

What’s the full readable log that you can get now?

Hi @DaveYYY

I’ve attached the logs after re-parsing them. They’re not great in this format either, as not everything outputs ^M at the end of the line, but you can at least read the last few processes’ output more easily (and the ASCII-art progress bars don’t make a mess of things; it’s just everything before them that’s harder to follow in this version…)

2024-07-09_logs_reparsed.zip (234.8 KB)

That archive includes the complete log and the other logs from the logs directory.

While I was regenerating this I spotted some early warnings about ‘boot chain not completed set to 0’, which left me confused: on the one hand it’s a warning, on the other it seems to say that ‘boot chain is not complete’ is false.

The final lines in the file also seem to suggest that there’s no space on a (host?) disk, which is also a surprise!

Checking inside the container, the root partition has 47GB free, the /home/nvidia user partition has 2.7TB free, and /dev/shm has 7.7GB free.
For complete transparency, it is true my [host/external] root partition is down to its last GB, but that shouldn’t affect anything inside docker, so I’m leaning towards that being a red herring, possibly related to overlays inside docker somehow?
But I guess it could be a permission issue and an unfortunate error message?

Or does it really mean something on the device has not responded/run out of disk space?
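One thing worth checking on the ‘no space’ theory: docker’s writable layers live under /var/lib/docker on the HOST root filesystem by default, so a nearly-full host root can produce “no space left on device” inside a container even when df inside the container looks healthy. A quick check on both sides (sketch):

```shell
# Inside the container, the mounts the OP already checked:
df -h / /home /dev/shm 2>/dev/null || df -h /

# On the host, confirm where docker keeps its data and how full it is:
#   docker info --format '{{.DockerRootDir}}'   # typically /var/lib/docker
#   df -h /var/lib/docker
```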

I’ve got a lot of trees to bark up and not much clarity over which one contains the golden goose I’m looking for. Feels like the nature of wizards, even though this wizard has copious logs.

Any recommendations appreciated.

What do you mean by [host/external] here?
Can you try running SDKM natively instead of inside docker?