AGX Xavier Jetson Linux Image Based OTA Update from r32.7.X to r35.1 Fails - Black screen with flashing cursor

I’m running through the OTA process to upgrade Jetson Xavier AGX Devkit modules from 32.5 to 35.1.

Here’s the process I’m following:

Part 1 - Update r32.5 to r32.7 - Jetson Machine

  1. Update Jetson to 32.7 by:
    sudo vim /etc/apt/sources.list.d/nvidia-l4t-apt-source.list
    and updating the release number from 32.5 to 32.7
  2. Run sudo apt update
  3. Run sudo apt dist-upgrade. I’ve also used sudo apt dist-upgrade -y -o Dpkg::Options::="--force-confnew" to speed this up.
  4. Reboot the Jetson for good measure
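
The edit in step 1 can be scripted roughly as follows. This is a minimal sketch, not an NVIDIA-provided tool: the helper name `bump_l4t_release` and the `SOURCE_LIST` variable are hypothetical, and it assumes the source list contains the release as a literal token such as `r32.5`. Check the actual file contents before running anything like this as root.

```shell
# Hypothetical helper: replace one L4T release tag with another in an apt
# source list. Assumes the release appears as a literal token such as "r32.5".
bump_l4t_release() (
    file="$1"; old="$2"; new="$3"
    # Escape dots so "r32.5" is matched literally rather than as a regex.
    old_re=$(printf '%s' "$old" | sed 's/\./\\./g')
    # -i edits the file in place (GNU sed, as found on Ubuntu-based systems).
    sed -i "s/${old_re}/${new}/g" "$file"
)

# Example, run as root, with SOURCE_LIST set to the file from step 1:
#   bump_l4t_release "$SOURCE_LIST" r32.5 r32.7
#   apt update && apt dist-upgrade
```

The function body runs in a subshell, so its variables do not leak into the calling shell.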

This works nicely and I can verify the release has in fact been updated as expected. Great.

Part 2 - Update r32.7 to r35.1 - Host machine
I’m following the guide here: Jetson Linux r35.1 Docs

  1. Download the sources for r35.1 and r32.7, and the file systems for both respectively.
  2. Unpack the file systems into their respective directories (<rel>/Linux_for_Tegra/rootfs)
  3. I apply the binaries to both releases using the apply_binaries.sh script. Note I have also tried not applying binaries to the current release (i.e. r35.1).
  4. I add a default user to both releases using the ./tools/l4t_create_default_user.sh script. Note I have also tried skipping this, and doing it on only one release or the other, with no success.
  5. I export the variables BASE_BSP and TARGET_BSP, each set to <my/path/to/release>/Linux_for_Tegra for the respective release, i.e. BASE_BSP=/home/username/R32.7.3/Linux_for_Tegra
    and
    TARGET_BSP=/home/username/R35.1/Linux_for_Tegra.
    Note it’s different in real life as our paths are sensitive, but the idea is the same, and I’m just following the instructions.
  6. I install the ota tools to ${TARGET_BSP} as directed.
  7. I generate the OTA update payload with the following:
    sudo -E ./tools/ota_tools/version_upgrade/l4t_generate_ota_package.sh jetson-agx-xavier-devkit R32-7, obviously from within ${TARGET_BSP}.
  8. I scp the payload to the target Jetson device.
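
Before step 7, the two BSP variables can be sanity-checked with something like the sketch below. The function `check_bsp_env` is hypothetical and is not part of the OTA tools; it only assumes what steps 2 and 5 describe, namely that each variable points at a Linux_for_Tegra directory containing a rootfs directory.

```shell
# Hypothetical pre-flight check (not part of the NVIDIA OTA tools).
# Confirms BASE_BSP and TARGET_BSP are set and that each directory
# contains a rootfs subdirectory, as described in steps 2 and 5.
check_bsp_env() (
    status=0
    for name in BASE_BSP TARGET_BSP; do
        # Look up the value of the variable whose name is in $name.
        eval "dir=\${$name:-}"
        if [ -z "$dir" ]; then
            echo "$name is not set" >&2
            status=1
        elif [ ! -d "$dir/rootfs" ]; then
            echo "$name=$dir has no rootfs directory" >&2
            status=1
        fi
    done
    exit "$status"
)
```

It could be chained in front of the command from step 7, for example `check_bsp_env && sudo -E ./tools/ota_tools/version_upgrade/l4t_generate_ota_package.sh jetson-agx-xavier-devkit R32-7`.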

Part 3 - Update r32.7 to r35.1 - Jetson machine

  1. I review /boot/extlinux/extlinux.conf to validate that:
    INITRD /boot/initrd
    and that the boot device is set with something like
    root=/dev/mmcblk0p1 rw rootwait rootfstype=ext4
  2. Download the ota_tools_R35.1.0_aarch64.tbz2 OTA tool package.
  3. I create a directory ota in the home directory of our default user, i.e. ~/ota and set WORKDIR to the complete path of this folder.
  4. I unpack ota_tools_R35.1.0_aarch64.tbz2 to the ${WORKDIR} as instructed
  5. I create an /ota/ directory
  6. I copy the ota_payload_package.tar.gz to the /ota/ folder.
  7. I unpack the OTA payload package. Note that here you actually send mixed messages about whether to unpack the archive or not. The script seems to do it, and I have tried both doing it and not doing it; neither fixes my issue.
  8. I then run the update script:
    cd ${WORKDIR}/Linux_for_Tegra/tools/ota_tools/version_upgrade
    sudo ./nv_ota_start.sh /dev/mmcblk0 /ota/ota_payload_package.tar.gz
  9. I get no errors here, and reboot the device.
  10. When the device reboots, it flashes the NVIDIA splash screen twice, then goes to black with a flashing horizontal cursor in the left-hand corner of the screen.
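
The manual review of extlinux.conf in step 1 of this part could be automated along these lines. This is only a sketch: `check_extlinux` is a hypothetical helper, and it checks nothing beyond the two things listed in that step (an `INITRD /boot/initrd` line and a `root=` argument).

```shell
# Hypothetical check mirroring the extlinux.conf review: confirm there is
# an "INITRD /boot/initrd" line and a root= argument, then print the root
# device that was found.
check_extlinux() (
    conf="$1"
    if ! grep -Eq '^[[:space:]]*INITRD[[:space:]]+/boot/initrd' "$conf"; then
        echo "no 'INITRD /boot/initrd' line in $conf" >&2
        exit 1
    fi
    # Extract the value of the first root= argument, if any.
    root=$(sed -n 's/.*root=\([^[:space:]]*\).*/\1/p' "$conf" | head -n 1)
    if [ -z "$root" ]; then
        echo "no root= argument in $conf" >&2
        exit 1
    fi
    echo "$root"
)

# Example on the Jetson:
#   check_extlinux /boot/extlinux/extlinux.conf
```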

The device remains in this bricked state indefinitely. I am able to ping the device and attempt ssh, but ssh fails authentication for the default user I created. I’ve also tried various other logins with no success.

I have repeated this process (with the variations discussed above) by reflashing the device back to a fresh r32.5 and following the above, but nothing seems to work.

Bump. This is blocking for our team, we need to have a robust way to upgrade to the latest release. Please can we get a response soon.

Update: connected to the serial console, and this seems to be some of the relevant details:

init_ota_log /mnt/ota_log
Create log file at /mnt/ota_log/ota_20230113-025911.log
OTA_LOG_FILE=/mnt/ota_log/ota_20230113-025911.log
init_exception_handler /mnt /mnt/ota_log/ota_20230113-025911.log 1
ota_validate_payload /mnt/ota_work jetson-agx-xavier-devkit R32-7
Validating OTA payload
ota_check_rollback /mnt/ota_work jetson-agx-xavier-devkit R32-7
Invalid target board jetson-agx-xavier-devkit
Failed to run "ota_check_rollback /mnt/ota_work jetson-agx-xavier-devkit R32-7"
OTA retry count file is at /mnt/ota_retry_count
OTA retries 1 time(s)
Reached OTA max retries (1 times)
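
To pull just the failure lines out of a saved copy of a log like the one above, a filter along these lines may help. It is a rough sketch: `ota_failures` is a hypothetical helper, and the patterns are taken only from this excerpt, not from any documented log format.

```shell
# Hypothetical helper: print the lines of an OTA log that look like failures.
# The patterns come only from the excerpt above, not from NVIDIA documentation.
ota_failures() {
    grep -E '^(Invalid|Failed to run|Reached OTA max retries)' "$1"
}

# Example, against a log saved from the serial console:
#   ota_failures ota_20230113-025911.log
```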

No idea why jetson-agx-xavier-devkit isn’t a valid board, that is what you’ve got in the documentation here, and it is a devkit module.

hello khan.schroder-turner,

please refer to Topic 230704 for steps to update from r32.7.2 to r35.1

This method works with the exception of one step.

I ran into this issue:

Basically, when I tried to do the upgrade from 32.7.2 → 35.1.0, I ran into:
Error: "BASE_BSP" is not set
when running the ./tools/ota_tools/version_upgrade/l4t_generate_ota_package.sh script. I checked manually many times that it was correctly set. I tried running it with the variables set inline, and exporting the vars under sudo too, but nothing worked.

The workaround I ended up using was to switch to the root user, export the variables, then run the script. I then proceeded with the update as usual. This appears successful.
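
One possible explanation, not confirmed anywhere in this thread, is that sudo resets the environment by default, so variables exported in the calling shell may not reach the script. An alternative to switching to root is to hand the variables to the command explicitly through `env`. The wrapper below is a sketch under that assumption; `run_with_bsp_env` and the `SUDO` override are hypothetical names.

```shell
# Hypothetical wrapper: run a command with BASE_BSP and TARGET_BSP passed
# explicitly via env(1), so they do not depend on sudo preserving the
# caller's environment. Set SUDO="" for a dry run without root.
run_with_bsp_env() {
    ${SUDO-sudo} env BASE_BSP="$BASE_BSP" TARGET_BSP="$TARGET_BSP" "$@"
}

# Example, from within the target BSP directory:
#   run_with_bsp_env ./tools/ota_tools/version_upgrade/l4t_generate_ota_package.sh \
#       jetson-agx-xavier-devkit R32-7
```

Because `env` is the program sudo starts, the two variables are set inside the privileged process itself.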

Feedback/comments:

You may like to formalise and update your documentation to include the steps from that post. At the moment there is nothing formally documented (outside of that forum post, which didn’t come up in any of my searching) that we would need to do that, and that post is pretty average from a documentation standpoint - I had to read it a good 5-10 times to fully pick out what had to be done. Either that or update the scripts to include those steps. The documentation even says that we can do minor releases with dist-upgrade, so I’m not sure how we’re supposed to know that we need to do a full image-based OTA update prior to updating to r35.1.0.

Also, the documentation link for r32.7.2 points to r32.6.1 - see here:
release page
linked documentation
This definitely led me to be a bit confused. I iterated through all the documentation I could find by messing with the URL, to see if there were any differences between them. With respect to the OTA update, I couldn’t find any major differences.

In the developer guide for r35.1.0, here, in the OTA update section, here
step 8 says to “Unpack the OTA payload package and prepare to start the OTA:”. We don’t need to unpack the payload from what I can tell, and the steps that follow also don’t unpack the payload - please clarify the documentation.

Could you please add some documentation about the ./tools/ota_tools/version_upgrade/build_base_recovery_image.sh script in the developer documentation?

hello khan.schroder-turner,

FYI, a direct software upgrade from rel-32 to rel-35 is currently not supported.
you may also see the discussion thread on the two-step update as a workaround (i.e. please make sure the device is booting from chain A) for moving to an r35.x release version.

in short, r32.x → r35.1 is not supported now.
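
For the chain A check mentioned above, one way to see the active slot is the `nvbootctrl` utility on the Jetson. The sketch below makes two assumptions that should be verified on the device: that `nvbootctrl get-current-slot` prints just the slot number, and that slot 0 corresponds to chain A. The helper name `slot_to_chain` is hypothetical.

```shell
# Hypothetical helper: map a numeric boot slot to its chain letter.
# Assumes slot 0 is chain A and slot 1 is chain B.
slot_to_chain() {
    case "$1" in
        0) echo A ;;
        1) echo B ;;
        *) echo "unknown slot: $1" >&2; return 1 ;;
    esac
}

# Example on the Jetson (assumes the command prints only the slot number):
#   slot_to_chain "$(sudo nvbootctrl get-current-slot)"
```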

You might like to re-read the documentation.

It literally says:

image-based OTA currently supports updates from version 32.7.x to version 35.1

Hi khan.schroder-turner,

The image-based OTA update is supported from r32.7.X → r35.1,
but not from r32.5 → r35.1.
So if you want to update from r32.5 → r35.1, you can use the topic-230704 workaround.
We will support this function in the next r35.3 release.

I understand that. If you read my whole post, I describe what I did. The documentation says you can do minor upgrades (i.e. XX.Y.Z) through the dist-upgrade method.

See here

Then I proceeded with the image-based OTA. Unless your documentation is incorrect - which is what I think is the case, and you won’t acknowledge it - then the update process is broken.

Please read what I’ve written properly. I’ve spent a lot of time putting together the steps to try and save you time asking me for information so we can resolve this quickly.

Do you have any response as to why the method documented (and discussed) doesn’t work?

hello khan.schroder-turner,

we’ve tested and confirmed the steps of the image-based OTA update r32.7.X → r35.1.
as mentioned in comment #11, please make sure the device is booting from chain A before moving to r35.1

furthermore,
let’s follow up in Clarification of OTA Update Process on your combination of dist-upgrade and image-based OTA for moving to r35.1

Thanks @JerryChang for reproducing and testing, happy to move the discussion over there.