How to upgrade a deployed TX2

Is there a method to upgrade a deployed TX2 without a direct connection?

We have deployed several TX2s to remote sites, so connecting to them directly and using JetPack to update will not work for us.

Currently, our own software updates are distributed as Debian packages through APT, but we don’t know how to upgrade the base system image.

I am not aware of any method to get the upgrade via APT,
but I have a vague feeling that I once succeeded in copying a raw disk image to a Jetson TK1 over the network with dd. That was a long time ago, and I can no longer recall the details well enough to say whether it actually happened or whether I only intended to do it, so I cannot confirm that the method works.
With a couple of Jetsons on hand, it should be possible to confirm by experiment whether the following does the trick over the network:

dd if=/dev/mmcblk0 | ssh nvidia@jetson2 dd of=/dev/mmcblk0

Upgrading rootfs by that method would work if there are no changes to kernel or device tree…or any of the non-rootfs partitions. Most releases these days change all of those…the TK1 releases are older so they did not have that issue.

The newer TX2 releases would fail when using this method due to how partitioning and boot work together with device tree (and in the case of R28.2 the signing of device tree). Something close to this could possibly be made to work though…

If you were to create a fully upgraded system locally, copy its entire eMMC with dd (the clone mechanism will clone partitions, but it won’t clone the metadata at the start of the disk), and then get the remote Jetson booted to alternate media, writing that image to its eMMC would probably work (but it’d be risky). The trick is how to get it booted to alternate media so dd could run on the eMMC.

One example would be to reboot it with a rescue SD card…but then that probably requires someone to be there in person. Once the operation is complete, then someone would have to remove the SD card. I could see a scheme where one leaves a blank SD card in, and then uses the internet to stream the rescue image, reboot it, use the rescue image to stream to eMMC just like it had the SD card, and edit the SD card to boot to eMMC once again…followed by erasing the SD card remotely so the bootloader would stop trying to use it. It’s complicated and full of places it could go wrong. And if there is no existing SD card installed, and no USB2 SATA drive on the full-sized USB connector, then no alternate/rescue image can be used without being there in person.

My guess is that if the rootfs itself were being overwritten with dd while it is in use you’d end up with a kernel panic or a corrupted file system. Maybe…if the system could be rebooted read-only, then you could get away with streaming an image over the existing mmcblk0. Getting read-only without a serial console and getting it wrong would be a one-way trip to needing personal intervention since it means extlinux.conf couldn’t be edited without getting rid of read-only…which in turn would probably require serial console if an outright flash isn’t going to be required to get rid of read-only.

Thank you for the ideas.

Rewriting the rootfs while Linux is running probably will not work.

I may have to split the rootfs into two partitions and alternate between them using U-Boot.

Currently, for this kind of upgrade, we let the boards boot from an SD card and have the SD card replaced at the remote site. But as we deploy more devices, this method will become a burden for the remote site.

I’m curious how others solve the problem.

Something to keep in mind is that there appears to be some dual partition scheme being developed, but not yet complete. If you look at the partitions of R28.2 (“sudo gdisk -l /dev/mmcblk0”) you’ll see this dual partition scheme. Whatever I suggest here might be outdated and irrelevant in the future. Even so, it might be useful to think about the following…

There would be a big advantage if you were to have a separate “/boot” partition versus the rest of rootfs (I have not tried to do this, but it is tempting). Think first about when “/boot” is used by what software, and then what and when the rest of “/” is used.

U-Boot will look for an extlinux.conf in the partition named as the destination during flash (e.g., “sudo ./flash.sh jetson-tx2 mmcblk0p1” means extlinux.conf must exist in mmcblk0p1). U-Boot does not care about anything but the “/boot” subdirectory of this partition. If you use a serial console, interrupt U-Boot, and go to the command prompt, then the environment variables will show you the search order and search devices used when looking for extlinux.conf.

Once the Linux kernel runs, the root partition will be looked for in the “root=” entry passed in the APPEND key/value pair of extlinux.conf, but this value was already read by U-Boot and Linux will never read this from disk. “/boot” could be deleted after U-Boot has read in everything, and Linux would never know nor care. It just happens that by default convention “mmcblk0p1” contains both “/boot” and the rest of the rootfs, but it isn’t mandatory.

If you were to flash normally, but have no sample rootfs other than “/boot”, and you were to manually name the rootfs size with the flash.sh parameter “-S” (e.g., “sudo ./flash.sh -S 1GiB jetson-tx2 mmcblk0p1”), then you would essentially have a tiny “/boot” partition and no rootfs. The extra space could be added to the end of the eMMC at a later time using gdisk or gparted (you might want to preserve other partition sizes and order and labels), but you’d need to figure out a way to get the extlinux.conf to name a new “root=”.

During flash the “rootfs/” subdirectory on the host PC is used almost as an exact copy. Depending on arguments to flash.sh, extlinux.conf will be edited…you’d need to edit the file copied in rather than edit the “rootfs/boot/extlinux/extlinux.conf” since any edits there are lost during flash. For example, when flash.sh is told to flash to mmcblk0p1 it copies “bootloader/t186ref/p2771-0000/extlinux.conf.emmc” to “rootfs/boot/extlinux/extlinux.conf”. Editing the “.emmc” version would result in propagating your changes into “/boot/extlinux/extlinux.conf” of the final flash image. You could edit this for example to have “root=/dev/mmcblk0p29”. In theory, when Linux runs, this would be your rootfs. Perhaps you also have “/dev/mmcblk0p30”, and both are duplicates…or perhaps one is an upgrade mostly similar to the first…then a serial console could be used to select between entries. Or a “DEFAULT” could change which entry is used when no serial console intervenes. Example entries:

TIMEOUT 30
DEFAULT test1

MENU TITLE p2771-0000 eMMC boot options

LABEL primary
      MENU LABEL primary kernel
      LINUX /boot/Image
      APPEND ${cbootargs} root=/dev/mmcblk0p1 rw rootwait rootfstype=ext4

LABEL test1
      MENU LABEL test rootfs1
      LINUX /boot/Image_rootfs1
      APPEND ${cbootargs} root=/dev/mmcblk0p29 rw rootwait rootfstype=ext4

LABEL test2
      MENU LABEL test rootfs2
      LINUX /boot/Image_rootfs2
      APPEND ${cbootargs} root=/dev/mmcblk0p30 rw rootwait rootfstype=ext4

LABEL sdcard
      MENU LABEL SD Card
      LINUX /boot/Image
      APPEND ${cbootargs} root=/dev/mmcblk1p1 rw rootwait rootfstype=ext4

LABEL sata
      MENU LABEL USB SATA
      LINUX /boot/Image
      APPEND ${cbootargs} root=/dev/sda1 rw rootwait rootfstype=ext4

…copy relevant kernel “Image” files in to “/boot” of mmcblk0p1 so you can have independent kernels. Use a rescue SD card to add mmcblk0p29 and mmcblk0p30.
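Before rebooting into an edited extlinux.conf, it may be worth checking that every kernel named on a LINUX line is really present in “/boot”. The following is only a minimal sketch, not an NVIDIA tool: the function name check_kernels and its two arguments are made up for illustration, and it assumes the simple one-path-per-LINUX-line layout used in the example entries above.

```shell
#!/bin/sh
# check_kernels CONF ROOT   (hypothetical helper, for illustration only)
#   CONF: path to an extlinux.conf
#   ROOT: mount point of the partition holding /boot; pass "" when the
#         partition is mounted at /
# Returns 0 if every LINUX path exists under ROOT, 1 otherwise.
# Limitation: kernel paths containing spaces are not handled.
check_kernels() {
    conf=$1
    root=$2
    missing=0
    # awk strips the leading indentation and prints the path after LINUX.
    for kernel in $(awk '$1 == "LINUX" { print $2 }' "$conf"); do
        if [ ! -f "$root$kernel" ]; then
            echo "missing kernel image: $root$kernel" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

On a running Jetson this could be called as check_kernels /boot/extlinux/extlinux.conf "" and the reboot skipped if it returns non-zero.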

If you want to alternate active rootfs, then the extlinux.conf of the previous partition must always survive and point to a valid rootfs if your upgrade is to be “safe”. U-Boot environment determines where that extlinux.conf must be. Various non-rootfs partition content and environment variables can change if U-Boot survives due to boot environment setup requirements. If you keep “/boot” pristine and protected, then in theory if U-Boot itself survives testing, then you can afford to make mistakes (without a serial console mistakes would still be at minimum “inconvenient”). Under the assumption that two rootfs images work using the same device tree and other startup initialization (cboot, mboot1, mboot2, and so on), then there is no reason you can’t dual boot and experiment with one of the rootfs partitions while keeping your “/boot” partition protected.

Your life would be much easier if you had something like an ethernet serial UART console adapter.

Thank you for the detailed explanation.

I think I will try it when we have time. The following is the note I wrote in our internal GitHub. I am posting it here in the hope that it is helpful for others.

use sd card to hold boot and 2 rootfs.

partition layout:
mmcblk1p1 - /boot (read only)
mmcblk1p4 - /boot2 (read write)
mmcblk1p2 - /rootfs1
mmcblk1p3 - /rootfs2

reasoning:

  • if anything goes wrong, we can ask the remote site to replace the sd card
  • the built-in 32 GB is not large enough for two rootfs in our case, since we record videos
  • I strongly believe the old U-Boot can load any new kernel without problems. After all, it’s just a bootloader and our board is not going to change
    (may need to check the signing part and disable it if necessary)
  • if possible, we want to set /boot to read only and use a separate partition for changing config

/boot will be only for uboot and the extlinux

save the kernel on rootfs1/rootfs2 and use dd image instead of tar
reasoning:

  • CUDA depends on the kernel (e.g. installing cuda-9 from JetPack 3.2 on an L4T 28.1 rootfs doesn’t work; there is a driver version error when a program starts). We don’t know where the driver is stored. I believe there are both kernel drivers and userland libs. This incomplete information makes a partial upgrade too risky for remote deployment in our opinion.
  • a dd image is easier to checksum
  • we can try some fall-back logic in the future. We have a physical button on the GPIO, so we may be able to modify the U-Boot code to check the button

upgrade procedure:

  1. download root image
  2. flash the whole image to the other partition
  3. checksum the whole partition
  4. change extlinux
  5. reboot
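Steps 2 and 3 of the procedure above could look roughly like the sketch below. It is an illustration under assumptions, not a tested updater: the function name write_and_verify is made up, it assumes a SHA-256 checksum is published alongside the image and that sha256sum and a dd supporting conv=fsync are available, and on a real device it must run as root against the inactive partition only.

```shell
#!/bin/sh
# write_and_verify IMAGE TARGET EXPECTED_SHA256   (hypothetical helper)
#   IMAGE:  downloaded rootfs image (step 1)
#   TARGET: the inactive rootfs partition, never the one currently mounted
#   EXPECTED_SHA256: checksum published together with the image (assumption)
write_and_verify() {
    image=$1
    target=$2
    expected=$3

    # Verify the download before touching the partition.
    actual=$(sha256sum "$image" | cut -d' ' -f1)
    if [ "$actual" != "$expected" ]; then
        echo "downloaded image does not match expected checksum" >&2
        return 1
    fi

    # Step 2: write the image. conv=fsync flushes the data before dd exits;
    # notrunc only matters when TARGET is a regular file (as in testing).
    dd if="$image" of="$target" bs=4M conv=fsync,notrunc 2>/dev/null || return 1

    # Step 3: read back exactly as many bytes as the image holds (the
    # partition is usually larger than the image) and compare checksums.
    # Note: the read-back may be served from the page cache.
    size=$(( $(wc -c < "$image") ))
    written=$(head -c "$size" "$target" | sha256sum | cut -d' ' -f1)
    if [ "$written" != "$expected" ]; then
        echo "read-back checksum mismatch on $target" >&2
        return 1
    fi
}
```

Only if this returns 0 would step 4 (changing extlinux) go ahead; on any failure the old rootfs and the old extlinux.conf are left untouched.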

use APT to distribute app update

reasoning:

  • the previous procedure will take a while; we can schedule an upgrade event in the maintenance window, and the update should not happen more than once per half year
  • we already have a working apt distribution system based on aptly and s3

Beware that where the boot loader looks for extlinux.conf can change during write of other partitions and an interrupt of the write can leave the bootloader unreachable. One illustration is if you flash to mmcblk1p1 (SD card) versus to mmcblk0p1 (eMMC). When U-Boot runs this option is where boot looks for an ext4 file system with the extlinux.conf (the “/boot”), but there is a handoff between stages and it isn’t until you get to U-Boot that you have control. Once in U-Boot you can use environment variables/macros…I’m thinking if your write goes bad something could stop the bootloader from ever being reached (e.g., power outage during update). You won’t be able to attempt a second update if no boot device can be reached because of a corrupted environment.

In the serial console after interrupting U-Boot experiment with:

printenv
printenv boot_targets
printenv bootcmd
printenv distro_bootcmd
printenv bootcmd_mmc0
printenv bootcmd_mmc1

Note that if you want to experiment with the environment in a non-destructive way, a change made with “setenv name value” (name and value are separate arguments, with no “=”) only lasts for that one boot. If you run “saveenv”, then it is saved and persists. Writing a partition and then failing could leave some of this unbootable until flashing the old fashioned way with a USB cable. Writing partitions can essentially be the same as writing to these environment variables.

One note of caution…sometimes network settings can be such that the MAC address is saved from a clone or created from a source such that your image could force the same MAC address on every unit flashed this way. Do check ifconfig after you’ve tested a couple of flashes and see if the MAC address remains constant, or if they have all converted to the same MAC.

I am not going to write to any partitions on the eMMC.

Thus, for the “write goes bad” part, the only critical section is step 4, changing extlinux on the SD card, which should be fast and almost atomic (the mv trick). And as I said, if it really goes wrong, we can ask the remote site to replace the SD card. If only a handful of devices instead of hundreds need to be replaced manually, the remote site will be happy.

Could you give more info on that cloned MAC address part?
The only place I can think of is the Network Manager’s clone-mac config. We distribute the system-connections in apt, so should be ok. Do you know other places that will change the mac?
I have not read the tx2’s source code yet. But logically, I think mac would be read by the pre-boot-loader hardware initialization code. In my case, that part doesn’t change over the update process. so this would not be a problem for me right?

I still can’t believe there is no official way to upgrade a deployed system.

For an embedded system, I would not expect there to be no way to upgrade over the network.
It is just not practical to ask an industrial site to do physical USB-connection upgrades at scale.

We need the upgrade because of TensorRT’s TensorFlow support and for a newer version of the ZED camera SDK.

In case you will be upgrading the ZED camera firmware, you will need to connect the ZED camera to a Windows PC. So far, as it seems to me, it is hardly possible to upgrade the ZED firmware from a Linux system.

@Andrey1984: I did confirm from StereoLabs that they intend for ZED camera firmware to only be able to have an upgrade from Windows. But the firmware they talk about is that within the ZED rather than the “/lib/firmware/”. If the ZED had a previous firmware upgrade, then a clone of the Jetson won’t be an issue (assuming the right release was used in the first place…the whole “/usr/local/zed*” still has to be the right release, but you can run multiple versions of this “/usr/local/” content…I just don’t know why anyone would).

NOTE on ZED: The “/usr/local/zed*” content is just a file copy operation, there won’t be anything special or different about producing this via clone versus via a direct install.

So far as MAC goes there are times when udev might have added a rule for your network which took the MAC from your current address and caused this to be used when it renames an interface. I’m not sure when such a rule is added by udev, but if it occurs it tends to be part of “/etc/udev/” (you can see several rules there which all kinds of services update). This does not always occur, and in fact most of the time probably doesn’t. If it does happen, and you don’t know it, and you restore a clone from one Jetson to the other, then the udev rule will also restore the previous Jetson’s MAC address…this could cause some very strange and hard-to-track errors for two devices on a network to claim the same MAC address.

When you run “ifconfig” (e.g., “ifconfig eth0”) the MAC address shows as the HWaddr (six hex bytes). If you test to restore a clone once to a different Jetson and check the HWaddr of ifconfig, the addresses should be unique from the prior Jetson. If not, then look in “/etc/udev/” for that repeating MAC and erase it or update it to the new hardware.
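The check described above can be scripted along these lines. This is a rough sketch: both function names are made up, “/etc/udev” is simply the location named in this thread, and it only finds rules that spell out the MAC literally.

```shell
#!/bin/sh
# iface_mac IFACE   (hypothetical helper)
# Print the MAC address the kernel currently reports for IFACE.
iface_mac() {
    cat "/sys/class/net/$1/address"
}

# find_mac_in_rules MAC [DIR]   (hypothetical helper)
# List files under DIR (default /etc/udev) that mention MAC, ignoring case.
# Returns non-zero when no file matches.
find_mac_in_rules() {
    mac=$1
    dir=${2:-/etc/udev}
    # -F: treat the MAC as a fixed string; -l: print file names only.
    grep -rliF -- "$mac" "$dir"
}
```

On a freshly restored clone, one would compare the output of iface_mac eth0 with the MAC of the Jetson the image came from, and run find_mac_in_rules with that original MAC; any file it lists is a candidate for the rule that carried the address over.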

I have seen this occur, and if you clone a large number of Jetsons and put them on the same network, then ifconfig will end up with “collisions” and odd issues. You only need to check a clone once to see if it caused a duplicate MAC. If the interface is named “eth0”, or similar by that convention, it seems the odds of udev being involved goes down (or away…but I don’t know). If the interface has been renamed (e.g., I have a Fedora system with multiple NICs, one is enp4s6 due to udev), then the odds go up this might happen. R28.2 doesn’t seem to rename, but some of the releases do…you have to check this each release.