OTA update in JP 4.3

Hi,

According to this documentation:

https://docs.nvidia.com/jetson/l4t/index.html#page/Tegra%20Linux%20Driver%20Package%20Development%20Guide%2Fquick_start.html%23wwpID0EVHA

It seems that in JP 4.3 you can upgrade different components of L4T using deb packages.

A few questions about that:

  1. Is this the future direction for L4T/JP upgrades? Will we be able to upgrade to the next versions of JP this way? Let’s say JP5 is released (say, with Ubuntu 20.04 LTS): is NVIDIA planning to use a regular Debian upgrade for that, including kernel, DTB, bootloader, firmware, etc.?

  2. Is there any fault recovery/redundancy mechanism in case the upgrade fails? Is there a recommended way to do this?

In our use case (and many outdoor industrial applications are the same) we have no physical access to our devices, so we must use OTA updates that are fault-tolerant, because replacing the devices carries a major capital cost.

  1. Is this the future direction for L4T/JP upgrades? Will we be able to upgrade to the next versions of JP this way? Let’s say JP5 is released (say, with Ubuntu 20.04 LTS): is NVIDIA planning to use a regular Debian upgrade for that, including kernel, DTB, bootloader, firmware, etc.?

We cannot guarantee an update that big. Currently, it works for small upgrades such as rel-32.x to rel-32.x, and it covers the whole BSP (kernel/DTB/bootloader/firmware).

  2. Is there any fault recovery/redundancy mechanism in case the upgrade fails? Is there a recommended way to do this?

The A/B redundancy mechanism should prevent the failure.

  1. OK, so this isn’t a proper OTA solution… For the next JetPack version, you expect us to flash the device again? I thought NVIDIA understood that this is not viable for an edge deployment.

  2. Is the A/B redundancy integrated into the deb package update? Or do we need to enable A/B and then manage it ourselves?

Hi urielom8ug,

I would say you don’t need to be pessimistic about it. The only point in comment #2 is that we cannot guarantee it. That does not mean there will be no such feature.

  2. Is the A/B redundancy integrated into the deb package update? Or do we need to enable A/B and then manage it ourselves?

The deb package handles this part automatically.

WayneWWW, in the industrial/edge world you can’t afford this kind of thing; not understanding this means NVIDIA still doesn’t understand this market.

Let’s say you release a device believing that you can update it over the air, and you deploy thousands of devices (our real case, not a theory). Next year NVIDIA releases JetPack 5 with CUDA 11 and critical fixes to GStreamer. (Now think of the same thing with CUDA 10 and JetPack 4 while we are on JP 3.3; this just happened to us.) You can’t upgrade your devices because there is no OTA path from JP 4 to JP 5, so you are stuck and have to work around it.

Using deb packages for OTA, coupled with automatic A/B redundancy enabled by default, is the right direction, and NVIDIA has already done the major work on it, so why not commit to it?

Regarding the A/B redundancy, just to be sure I understand:

  1. A/B redundancy is enabled by default (unlike JP 3.x).
  2. On first boot you are using slot A.
  3. Upgrading an L4T deb package updates slot B (or does it update slot A, with slot B as the backup?).
  4. You reboot the device and it tries to boot with the updated package.
  5. If it fails to boot (what is defined as a failure?), it falls back to the other slot, which in theory boots properly because it holds the state from before the upgrade.

Did I understand it correctly? Can you control the definition of “boots properly”? For example, I would need to make sure the device connects to the network, because if it does not, the device is bricked.
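
To make the question concrete, here is a minimal Python sketch of the behaviour I am assuming in the steps above. It is only a toy model, not how L4T actually implements it: the `Slot` and `Device` classes, the `max_retries` budget, the version labels, and the `health_check` callback are all hypothetical names I made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Slot:
    name: str
    version: str
    bootable: bool = True  # does the image in this slot boot at all?


@dataclass
class Device:
    slots: Dict[str, Slot]
    active: str = "A"      # step 2: first boot uses slot A
    max_retries: int = 3   # assumed retry budget before falling back

    def inactive(self) -> str:
        return "B" if self.active == "A" else "A"

    def install_update(self, version: str, bootable: bool = True) -> str:
        """Step 3 (assumed): write the update to the inactive slot."""
        target = self.inactive()
        self.slots[target] = Slot(target, version, bootable)
        self.active = target  # step 4: the next boot tries the updated slot
        return target

    def boot(self, health_check: Callable[[Slot], bool] = lambda s: True) -> Slot:
        """Step 5 (assumed): retry the active slot, then fall back."""
        for _ in range(self.max_retries):
            slot = self.slots[self.active]
            if slot.bootable and health_check(slot):
                return slot
        # Retries exhausted: switch to the slot that held the previous state.
        self.active = self.inactive()
        fallback = self.slots[self.active]
        if fallback.bootable and health_check(fallback):
            return fallback
        raise RuntimeError("no slot passed the boot criteria")
```

In this model, `Device({"A": Slot("A", "old"), "B": Slot("B", "old")})` followed by `install_update("new", bootable=False)` and `boot()` ends up back on slot A. My question is whether something like `health_check` (for example, "the network came up") can be part of what counts as a successful boot.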

Hi urielom8ug,

Regarding your question about the direction:

I am sorry to give such a conservative reply.
OTA is a new feature that was released only recently, and some other features are still in progress. Thus, I can only say that the direction is not yet fixed.
As a forum admin here, my responsibility is to pass your request to the internal team so that they can hear you or fix the bug you reported. I cannot make a promise like “Yes, your feature will be supported in JP4.6” while the situation is still uncertain. If you have a strong business need for OTA, please also contact NVIDIA sales to highlight it. That would also speed up the process.

AFAIK, this deb update method should cover at least all rel-32.x -> rel-32.x updates in the future. As for something like k5.x/Ubuntu 20.04, I have not seen a plan yet, so I have nothing to share.

For the bootloader, the update is fail-safe.

If the update fails, the system can still boot: the bootloader will use the redundant copy, or the other partition slot. The Debian package system will report that the installation failed, and users can install the package again to retry the update.

Thank you for the honest answer. As of now this OTA is not enough for our use case (and many others). Once NVIDIA commits to this (and stops asking users to flash devices via USB for updates), it will be a big step in the right direction for the ecosystem.

A feature I recommend adding is a post-upgrade script: if it fails, the device goes back to the previous slot/update.

An example of why you need something like this in an OTA mechanism: sometimes the OTA (say, a kernel update) will succeed and the device will boot, but your Wi-Fi/network/LTE card will not work, and then the device is effectively bricked (there is no way to update it anymore).

Having a post-upgrade script allows the user to validate that critical aspects of the system (from the user’s perspective) still function correctly after the upgrade, and prevents soft-bricked cases.
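
As a rough illustration of what I mean, here is a minimal Python sketch of such a post-upgrade validation step. It assumes the OTA mechanism would expose some rollback hook; the `rollback` callable, the function names, and the `example.com` probe host are placeholders of mine, not anything that exists in L4T today.

```python
import socket
import time
from typing import Callable, Dict, List


def network_reachable(host: str = "example.com", port: int = 443,
                      timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def run_post_upgrade_checks(checks: Dict[str, Callable[[], bool]],
                            attempts: int = 3,
                            delay_s: float = 0.0) -> List[str]:
    """Run every check up to `attempts` times; return names that never passed."""
    failed = []
    for name, check in checks.items():
        ok = False
        for _ in range(attempts):
            try:
                if check():
                    ok = True
                    break
            except Exception:
                pass  # a crashing check counts as a failed attempt
            time.sleep(delay_s)
        if not ok:
            failed.append(name)
    return failed


def validate_or_rollback(checks: Dict[str, Callable[[], bool]],
                         rollback: Callable[[List[str]], None],
                         attempts: int = 3,
                         delay_s: float = 0.0) -> bool:
    """Call the (hypothetical) rollback hook if any critical check failed."""
    failed = run_post_upgrade_checks(checks, attempts, delay_s)
    if failed:
        rollback(failed)
        return False
    return True
```

A user would register their own critical checks, for example `validate_or_rollback({"network": network_reachable}, rollback=my_rollback)`, where `my_rollback` would switch the device back to the previous slot. The point is that the user, not the bootloader, decides what "the upgrade worked" means.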

Yes, I understand what you are trying to point out. This will be passed to the internal team to enhance OTA in a future release.