Failed to operate Jetson Linux image-based OTA update

ota_20230609-152904.log (10.7 KB)
Hello,

I have been trying to do an OTA BSP update (from L4T 35.2.1 to 35.3.1) on my AGX Orin, but the process always stops at the same place when I execute the “sudo ./nv_ota_start.sh /ota/ota_payload_package.tar.gz” command.

The attached file is the log file retrieved from the /ota_logs/ folder.

I followed the exact steps here to generate and trigger the payload, and I didn’t enable the A/B partitioning feature.

Let me know if I need to provide other information.
Thanks for the help.

Regards,
Eddie

Hi ewei2,

Are you using the devkit or a custom board for AGX Orin?

Do the other steps run without error? Does it fail at the 8th step on the Jetson device?

dd if=/tmp/var_tmp.bin of=L4TDefaultBootMode-781e084c-a330-417c-b678-38e696380cb9 bs=8
Failed to write /tmp/var_tmp.bin to L4TDefaultBootMode-781e084c-a330-417c-b678-38e696380cb9

Could you share the result of the following command on your board?

$ sudo ls -l /sys/firmware/efi/efivars/

Hi Kevin,

Sorry for the late reply.
Yes, I am using an AGX Orin devkit, and I didn’t get any errors in the other steps, only when I run the 8th step.

I have attached the efivars listing. Thanks for the help.

Eddie
efivars.list (3.8 KB)

The listing looks as expected.

Could you run the following commands on your board and check whether they run without error?

cd /sys/firmware/efi/efivars
xxd L4TDefaultBootMode-781e084c-a330-417c-b678-38e696380cb9
echo -ne "\x07\x00\x00\x00\x03\x00\x00\x00" >/tmp/recovery.bin
sudo chattr -i L4TDefaultBootMode-781e084c-a330-417c-b678-38e696380cb9
sudo dd if=/tmp/recovery.bin of=L4TDefaultBootMode-781e084c-a330-417c-b678-38e696380cb9 bs=8
sudo chattr +i L4TDefaultBootMode-781e084c-a330-417c-b678-38e696380cb9
xxd L4TDefaultBootMode-781e084c-a330-417c-b678-38e696380cb9
sudo reboot
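
As a side note on the eight bytes written above: an efivarfs file starts with a 4-byte little-endian attribute word, followed by the variable's data, and the kernel expects the whole variable in a single write, which is presumably why dd is given bs=8. The sketch below only builds and decodes the same byte sequence in a temporary file (the name payload is hypothetical); it does not touch /sys/firmware/efi/efivars. Reading the data value 3 as a recovery boot mode is an assumption based on the file name recovery.bin.

```shell
#!/usr/bin/env bash
# Sketch: inspect the 8-byte payload offline; nothing is written to efivarfs.
payload=$(mktemp)   # hypothetical scratch file
printf '\007\000\000\000\003\000\000\000' > "$payload"

size=$(wc -c < "$payload" | tr -d ' ')
# Byte 0 is the low byte of the little-endian attribute word.
attr_byte=$(od -An -tu1 -j0 -N1 "$payload" | tr -d ' ')
# Byte 4 is the low byte of the variable's data.
value_byte=$(od -An -tu1 -j4 -N1 "$payload" | tr -d ' ')

echo "payload size:        $size bytes"
echo "NON_VOLATILE:        $(( (attr_byte & 1) != 0 ))"
echo "BOOTSERVICE_ACCESS:  $(( (attr_byte & 2) != 0 ))"
echo "RUNTIME_ACCESS:      $(( (attr_byte & 4) != 0 ))"
echo "data value:          $value_byte"
rm -f "$payload"
```

Running xxd on the variable before and after the dd, as in the commands above, shows the same layout on the real file.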

Hello Kevin,

I copied the commands you provided into a bash script. The device appeared to start rebooting but never came back: the screen went black and unresponsive.
A few minutes later I tried to reset the device with the physical power button, but it no longer boots. The LED is on and the NVIDIA logo shows for a second, then everything goes black. Does this mean I have to reflash the device to recover it?

Eddie

Hello Kevin,

There is one thing I noticed that might cause the failure at the 8th step, although I can’t understand how it would.

Over the weekend I did successfully update the BSP by transferring the payload and your OTA toolkit to the edge device and executing all the steps LOCALLY in the terminal. That should confirm that the payload is generated properly and that what I did on the edge device is correct as well.
The experiments that ran into trouble at the 8th step were different: since I aim to do this over the air, the procedure (Steps 1~8) was put into a script, and a remote command forked a process to run that script in the background. That run stopped at the 8th step, as we saw in the log.

The only difference I can see between these two ways of triggering the update is that the background run was not given a stdout/stderr. But how would that matter?

I hope this helps. Thanks.

Eddie

So you can update the BSP by running the steps manually, one by one, in the terminal, but it fails at Step 8 if you use a script to run the update procedure instead.
Is that correct?

Does your script have the required permissions, or did you run it with sudo?

Yes, you are correct.

And yes, I did run the script with root permission. I have used the same approach for OTA BSP updates with other L4T versions on different target devices many times, and they all succeeded. I think it is fair to say I am quite familiar with the procedure, but this is the first time I have seen this error, specifically with L4T 35.2.1 to 35.3.1 on AGX Orin.

I have attached the script; I hope it helps.

Thanks.

Eddie
BSP_ota_deploy.sh (646 Bytes)

Also, I have tried another experiment, running the script remotely in two ways:

$ sudo ./BSP_ota_deploy.sh =====> Stopped at the 8th step.

$ sudo ./BSP_ota_deploy.sh > /tmp/ota_log 2>&1 =====> Worked!

It seems the script needs a stdout/stderr, just as it has when executed in a local terminal. I really don’t understand why, because it always worked with other versions on other devices without that redirection.

Could you also add sudo for Step 8 in your BSP_ota_deploy.sh?

sudo ./nv_ota_start.sh /dev/mmcblk0 /ota/ota_payload_package.tar.gz

Does the script then work on AGX Orin?

OK, I will add that and run the test later.

But yes, running the script as “$ sudo ./BSP_ota_deploy.sh > /tmp/ota_log 2>&1” does work on AGX Orin.

Hi Kevin,

I added “sudo” to the 8th step in the script, but with no luck.
It ran into the same problem; the log is attached.

Let me know if you have other thoughts. Thanks.

Eddie

ota_20230613-141345.log (10.7 KB)

Could you try giving the script full permissions and running it?

sudo chmod 777 BSP_ota_deploy.sh

Also, please use the following command in Step 8 instead (without “/dev/mmcblk0”):

sudo ./nv_ota_start.sh /ota/ota_payload_package.tar.gz

OK, I’ll try and report.
Thanks for the advice.

You could also try removing set -e in your script.

Hi Kevin,

I tried removing “set -e” from my original script (i.e. without adding sudo at the 8th step or changing the permissions of the script itself) and it worked. Does that mean the 8th step did complete with some errors, but they could be ignored?
I am uneasy about dropping “set -e” from the shell script, but I guess that is the current workaround (the alternative being to redirect stdout/stderr when launching the script).

Something still doesn’t make sense to me, and I would just like to mention that:

  1. The issue didn’t happen on the other versions and devices I have worked with.
  2. I still wonder why my original script works as long as there is an output file to catch the log, when the NV OTA Toolkit already has its own logging mechanism:
  • $ sudo ./BSP_ota_deploy.sh > /tmp/ota_log 2>&1 =====> Worked!
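
For illustration only, here is one general mechanism by which a missing stdout can matter under “set -e”. It is a sketch, not a diagnosis of nv_ota_start.sh: the thread does not establish the actual cause. If a process is started with stdout closed, a plain echo fails with a write error, and “set -e” then aborts the script at that line. The script demo.sh and all file names below are hypothetical stand-ins, not the real BSP_ota_deploy.sh.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a deploy script; NOT the real BSP_ota_deploy.sh.
workdir=$(mktemp -d)
cat > "$workdir/demo.sh" <<'EOF'
#!/usr/bin/env bash
set -e
echo "step 8: starting"   # returns non-zero if stdout cannot be written
touch "$1"                # stands in for the real work of the step
EOF

# Case 1: stdout is closed, as it can be for a detached background process.
status_closed=0
bash "$workdir/demo.sh" "$workdir/closed.done" >&- 2>/dev/null || status_closed=$?

# Case 2: stdout and stderr are redirected to a log file.
status_file=0
bash "$workdir/demo.sh" "$workdir/file.done" > "$workdir/ota_log" 2>&1 || status_file=$?

# Record whether the "real work" line was reached in each case.
done_closed=no; [ -e "$workdir/closed.done" ] && done_closed=yes
done_file=no;   [ -e "$workdir/file.done" ] && done_file=yes
echo "closed stdout: exit=$status_closed work_done=$done_closed"
echo "log file:      exit=$status_file work_done=$done_file"
rm -rf "$workdir"
```

In this sketch, both removing “set -e” and redirecting output to a file let the script reach the touch line, which at least matches the two workarounds observed. Whether the remote launcher really closes stdout is an assumption that would need checking, for example by listing /proc/$$/fd from inside the script.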

Again, thank you for the help.

Eddie

Thanks for your update.

Yes, it seems they can be ignored.
If you are still concerned, you could compare the logs from the two runs to find the details of the error that stops Step 8.
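
One way to collect that detail while keeping “set -e” is an ERR trap that records the failing command and its exit status before the script aborts. The following is a minimal sketch under stated assumptions: deploy_demo.sh, the log path, and the “false” placeholder are hypothetical and stand in for the real deploy script and OTA steps.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper to show the pattern; NOT the real BSP_ota_deploy.sh.
workdir=$(mktemp -d)
cat > "$workdir/deploy_demo.sh" <<'EOF'
#!/usr/bin/env bash
set -e
# Report which command failed, and with what status, before set -e aborts.
trap 'rc=$?; echo "FAILED (exit $rc) at line $LINENO: $BASH_COMMAND" >&2' ERR
exec >> "$1" 2>&1       # give every later step a valid stdout/stderr
echo "step 1 ok"
false                   # placeholder for a failing OTA step
echo "never reached"
EOF

demo_status=0
bash "$workdir/deploy_demo.sh" "$workdir/ota_demo.log" || demo_status=$?
demo_log=$(cat "$workdir/ota_demo.log")
echo "exit status: $demo_status"
echo "$demo_log"
rm -rf "$workdir"
```

The exec line also gives the script its own log file regardless of how it was launched, so the same wrapper works from a terminal and from a background process.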
