AGX Xavier image-based OTA update process never starts from NVMe

I am attempting to perform an image-based OTA upgrade from R32.7 to R35.1 on a non-devkit AGX Xavier remotely.

I have a Jetson on hand that has been flashed to the same state as our remote units (R32.7 on the NVMe drive, including some rootfs customization from our supplier). I am using it to test that the update works before I push it to our remote units, which are only accessible via SSH.

I have followed this guide on my test board up to step 9 “If no error occurred in step 8, reboot the target board.”
https://docs.nvidia.com/jetson/archives/r35.4.1/DeveloperGuide/text/SD/SoftwarePackagesAndTheUpdateMechanism.html

When I reboot, the recovery image is never loaded. The board boots straight into R32.7 on the NVMe, and no log is found in ota_logs.

I have tested with the NVMe removed from the Jetson, and the ota_payload_package works (i.e. it updates to the version in the OTA package) if the Jetson is running from eMMC storage. The update fails to apply if the board was originally flashed to the NVMe using SDK Manager.

I am happy to switch to an eMMC rootfs or to have the NVMe be OTA updated, although it looks like that's only supported on the NX modules at this time.

Is there any way to get this to work without physical access to the device, i.e. changing the default rootfs from NVMe to eMMC after flashing, or flashing to the NVMe itself?

When flashing, I notice that ./nv_ota_start.sh moves the extlinux.conf file from /boot, so I'm curious how it knows to boot into the recovery partition.

Any help would be much appreciated.

Hi hugh2,

Are you using AGX Xavier on a custom carrier board?
Do you have any custom layout change in XML?

Please share the full serial console log after you run the reboot command, and also the log from generating the OTA package on the host.

Let me clarify your use case:

  1. You are using AGX Xavier on a custom carrier board with NVMe connected.
  2. It is booted from NVMe with R32.7.x.
  3. You want to perform image-based OTA for NVMe to update to R35.1.

Please confirm whether my understanding is correct.

Hi Kevin,

  1. Yes, I am using a custom carrier board for AGX Xavier with NVMe connected.
  2. Yes, it is booted from NVMe.
cat /etc/nv_tegra_release
# R32 (release), REVISION: 7.2, GCID: 30192233, BOARD: t186ref, EABI: aarch64, DATE: Sun Apr 17 09:53:50 UTC 2022

df
Filesystem     1K-blocks     Used Available Use% Mounted on
/dev/nvme0n1p1  28768292 10938556  16345348  41% /
none            15958496        0  15958496   0% /dev
tmpfs           16343184       40  16343144   1% /dev/shm
tmpfs           16343184    30464  16312720   1% /run
tmpfs               5120        4      5116   1% /run/lock
tmpfs           16343184        0  16343184   0% /sys/fs/cgroup
tmpfs            3268636      112   3268524   1% /run/user/1000
/dev/mmcblk0p1  28768292    86688  27197216   1% /media/nvidia/2ed8837d-6eb5-43e6-bb21-dd7ba59fb900
  3. I want to perform image-based OTA to update to R35.1. I don't mind whether it is to the NVMe or eMMC.

Build logs on my host device
build_logs.txt (582.5 KB)

Here is my ota_start command on the jetson
sudo ./nv_ota_start.sh /ota/ota_payload_package.tar.gz

start.sh_log.txt (8.7 KB)

Here is the dmesg output after rebooting directly after the above command (which, from my research, is the same as the serial output). Let me know if this is incorrect.

dmesg(serial logs).txt (79.2 KB)

It seems your OTA package is generated for internal eMMC.
Could you share the command you used to generate OTA package?

I would like to check the serial console log rather than dmesg.
For serial console log from the devkit, you could refer to the following instruction.
NVIDIA Jetson Xavier - Serial Console (ridgerun.com)
For your custom carrier board, you should check where the debug UART is.
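
For anyone capturing the console from a Linux host through a USB serial adapter, the following is a minimal sketch rather than a verified procedure for any particular carrier board. The helper name first_existing and the candidate device names /dev/ttyUSB0 and /dev/ttyACM0 are assumptions for illustration; the 115200 baud rate is the usual Jetson debug UART setting but should be confirmed against the board documentation.

```shell
# Print the first path among the arguments that exists; return 1 if none does.
# Hypothetical helper: the candidate device names are supplied by the caller.
first_existing() {
    for path in "$@"; do
        if [ -e "$path" ]; then
            printf '%s\n' "$path"
            return 0
        fi
    done
    return 1
}
```

A possible use is `dev=$(first_existing /dev/ttyUSB0 /dev/ttyACM0) && sudo screen -L "$dev" 115200`, started on the host before running the reboot command on the Jetson. The -L option makes screen write the session to a log file, so the bootloader output that dmesg never sees is preserved.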

The command for running the build is at the top of build_logs.txt

ROOT_DIR=/home/conducivelaptop/ota
BASE_BSP=${ROOT_DIR}/JetPack_4.6.2_Linux_JETSON_AGX_XAVIER_TARGETS/Linux_for_Tegra
TARGET_BSP=${ROOT_DIR}/JetPack_5.0.2_Linux_JETSON_AGX_XAVIER_TARGETS/Linux_for_Tegra
JETSON_VER=R32-7 
TARGET_BOARD=jetson-agx-xavier-devkit

# Recovery image
sudo ./tools/ota_tools/version_upgrade/build_base_recovery_image.sh ${TARGET_BOARD} ${JETSON_VER} ${BASE_BSP} ${BASE_BSP}/rootfs ${TARGET_BSP}


# GENERATE OTA
sudo BASE_BSP=${ROOT_DIR}/JetPack_4.6.2_Linux_JETSON_AGX_XAVIER_TARGETS/Linux_for_Tegra ./tools/ota_tools/version_upgrade/l4t_generate_ota_package.sh ${TARGET_BOARD} ${JETSON_VER} 

There are no instructions for generating an OTA package for NVMe on the AGX, which is why I generated it for eMMC. Is there a way to generate it for NVMe that I am missing?

I am unable to get any serial logs while the device is booting, either via USB or via the UART connection (using a known-working serial converter). It may be something to do with the carrier board.

Since it reboots just as fast as before initiating the OTA update, I feel it's not even attempting any OTA boot sequence. Is there a different way to get the boot logs, or a setting I need to change on the Jetson itself?

It seems you ran this command in the wrong path.
You should generate the OTA package in your target BSP (i.e. in your case, run it under the R35.1 BSP package).

Okay, image-based OTA update for NVMe currently supports only Xavier NX.
For AGX Xavier, image-based OTA for NVMe should be supported in the next release.

Do you mean that you could not capture serial console log on the custom carrier board? Is this board designed by you?

Note:

TARGET_BSP=${ROOT_DIR}/JetPack_5.0.2_Linux_JETSON_AGX_XAVIER_TARGETS/Linux_for_Tegra

Sorry, I should have specified: before I ran the command I did

cd ${TARGET_BSP} 

so it is definitely running in the correct path, as I didn’t copy the build tools into the Jetpack_4.6… path.
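
To make the working directory explicit rather than implied, a small wrapper along these lines could be used. This is only a sketch: the helper names has_ota_tools and run_in_bsp are hypothetical, and the check simply looks for the l4t_generate_ota_package.sh path already used in the commands above.

```shell
# Succeed only if DIR looks like a BSP tree that contains the OTA tools.
# The relative path is the one used in the build commands in this thread.
has_ota_tools() {
    [ -f "$1/tools/ota_tools/version_upgrade/l4t_generate_ota_package.sh" ]
}

# Run a command from inside DIR, refusing if the OTA tools are missing there.
run_in_bsp() {
    dir=$1
    shift
    if ! has_ota_tools "$dir"; then
        echo "OTA tools not found under $dir" >&2
        return 1
    fi
    # The subshell keeps the caller's working directory unchanged.
    ( cd "$dir" && "$@" )
}
```

For example, `run_in_bsp "${TARGET_BSP}" sudo BASE_BSP="${BASE_BSP}" ./tools/ota_tools/version_upgrade/l4t_generate_ota_package.sh "${TARGET_BOARD}" "${JETSON_VER}"` runs the same generate command as before, but fails early if it is pointed at a tree without the tools.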

Is there an ETA on the next release to support image-based OTA for NVMe on the AGX?

The UART console log is not working as described in the documentation from our carrier board supplier. We are in communication with them about it.

For now, is there any way for me to force the Jetson to ignore the NVMe on boot, so that it uses what is written onto the eMMC by the sudo ./nv_ota_start.sh /ota/ota_payload_package.tar.gz command? Or are there any changes I can make to the command to do this?

It is supposed to be at the end of next month.

We could verify image-based OTA for eMMC now with your current steps, but without the serial console log we would not know what happens during the update.

The board flashes correctly if I remove the NVMe drive from the carrier board.
Can you please verify the steps:

  1. Flash the board to NVMe via SDK Manager.
  2. Follow the steps to create the OTA image for eMMC.
  3. Follow the steps to start the OTA on the Jetson which has been flashed to NVMe.
  4. Reboot, and send what the logs should look like.

Here
serial logs.txt (79.2 KB)

Why did you generate the OTA package for eMMC but use it to flash into NVMe?

I would like to check the full serial console log after you run the reboot command rather than just the dmesg after boot up.

I have to generate for eMMC because, as we discussed, NVMe is not supported until the next update. What would you recommend I do instead?

Is there a setting I have to change to get the serial logs to output? Nothing seems to work.

For the devkit, you could refer to the following steps to capture serial console log.
NVIDIA Jetson Xavier - Serial Console (ridgerun.com)
For the custom carrier board, please check with your vendor.
Is this board designed by you? If so, please check with HW team.

I would suggest verifying the eMMC update first. (You could boot from eMMC but mount NVMe as rootfs.)

I have verified that the OTA update package works if the NVMe is disconnected. The issue is that when the NVMe is connected, the board automatically boots from it. Since our remote units are all flashed to the NVMe, we are unable to use the OTA update because we cannot physically disconnect the NVMe. Is there any workaround, like telling the kernel on the NVMe to boot from the eMMC instead?

A Jetson device prefers to boot from a newly added (external) device.
You could refer to the following thread to move it to the bottom of the boot order.
64G AGX Orin boot Order Getting Reset (JetPack 5.1.1 - L4T r35.3.1) - #9 by KevinFFF

All this says is that I need to enter the UEFI menu to change the boot order, which requires connecting to the UART and physical access to the device. I assumed the whole point of an OTA update was that it can be performed “over the air”, i.e. without physical access.

Is there a way to initiate an OTA update on a device that is running on NVMe and tell it to boot from the eMMC on the next boot?
Alternatively, what is the reason AGX doesn’t support flashing to emmc? Can I use the same partitions as the NX module?

Figured out how to get the logs
Jetson Serial Logs.txt (103.5 KB)

You could try to configure boot-order in L4TConfiguration.dtbo with this instruction.

What do you mean by “AGX doesn’t support flashing to emmc”?

We want to check the serial console log from after you run the reboot command to trigger the OTA update process, rather than just from boot-up.

Those instructions say “Flash the device” as the last step, which is not helpful when I cannot physically access the device. Am I correct in saying that R32.7 does not use UEFI boot and uses CBoot instead? If so, these instructions only apply to R35.1.

Sorry, I mean using OTA update for NVMe on the AGX.

When generating this log I ran

sudo ./nv_ota_start.sh /ota/ota_payload_package.tar.gz 
sudo reboot

I noticed it doesn't even attempt to boot from the eMMC, which is where the update is meant to occur from.

Yes.

To generate the OTA package, please just follow the guide for the “Target version”.

I would suggest performing this in the next release, where it will be supported.

Maybe it is booting from eMMC but mounting NVMe as rootfs.

So is there documentation for getting it to boot from eMMC when the current version is R32.7, so that it can start using R35.1? I feel the documentation you sent is only useful if the device has already been updated, which is what I am asking you for help with.

Yes, this is what I did. There are no instructions in the target version for generating an OTA package for a device that is already running on NVMe. Can you please help with this?

So you are saying there is currently no solution to this problem? We have to wait for NVIDIA to release the update, and THEN for our supplier to update the customised packages. This could mean months without being able to update our devices.

Am I able to check this somehow? Do the logs that I generated show this information?
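
One way to look into this from a running system, as a rough sketch: compare the root= argument the kernel was booted with against the device naming seen in the df output earlier in the thread. The helper names root_from_cmdline and classify_root_dev are hypothetical. Note that this only shows which rootfs the kernel was told to mount, not which device the bootloader loaded the kernel from; that still needs the serial console log.

```shell
# Print the value of the last root= argument in a kernel command line string.
# Returns non-zero if no root= argument is present.
# The body runs in a subshell so that 'set -f' does not leak to the caller.
root_from_cmdline() (
    set -f                      # no glob expansion while splitting the string
    root=""
    for arg in $1; do
        case "$arg" in
            root=*) root="${arg#root=}" ;;
        esac
    done
    [ -n "$root" ] && printf '%s\n' "$root"
)

# Classify a block device path by its name prefix.
# A value such as PARTUUID=... is reported as "other".
classify_root_dev() {
    case "$1" in
        /dev/nvme*)   echo "nvme" ;;
        /dev/mmcblk*) echo "emmc" ;;
        *)            echo "other" ;;
    esac
}
```

On the Jetson, `root_from_cmdline "$(cat /proc/cmdline)"` shows the requested root device, and `classify_root_dev "$(findmnt -n -o SOURCE /)"` classifies the device actually mounted at /.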