I’d like to summarize what we know so far in order to “build a solution” for flashing via Ethernet:
I need to boot to a functioning rootfs (I can use initrd or A/B rootfs)
I need to sign each image I’m going to flash (except the rootfs)
My question is:
Can I run an experiment in which I use dd (or another suggested method) to back up all of the internal QSPI partitions (bootloader images and more), then back up all of the NVMe partitions, and then restore them?
The goal is to avoid using USB during this procedure.
An alternative would be to use the regular flash.sh and other tools to create and sign all images (internal QSPI and NVMe), copy them to a functional rootfs with enough free space, and then use dd to write them into the internal QSPI and NVMe partitions.
QSPI is not available through normal mechanisms. I believe that since it includes support for security, e.g., SecureBoot and signature checking, you would be denied access to it in a normally running system. Anything on the NVMe or rootfs is easy to copy. Everything else gets dramatically more complicated.
So basically I’m stuck. I can’t flash the entire system over Ethernet.
If I reduce my solution to flashing only the boot device (eMMC, external NVMe) via Ethernet,
is it possible to do so from a host rather than from an operational image on the target itself?
I’m asking for the case where both of my rootfs images (A and B) are corrupted and I need to use Ethernet to restore them.
eMMC models can have a lot updated over Ethernet, but not everything. If your QSPI content is already sufficient, then you don’t need to update it for eMMC models. Most of the other partitions would remain on eMMC, but the layout for external storage changes a lot. “Mostly” you are stuck, and if you don’t have a lot of patience and luck and a spare similar unit to test on, then it will fail. If the unit needing the update does not fully boot, then you also will not succeed. It is possible to update the eMMC and NVMe, with a lot of risk. Without the system already being bootable, though, there is no chance over Ethernet.
I suppose one very far outlying case is if your system is set to look for network boot (PXE or NFS) prior to finding the failed filesystems. That would be a form of “still bootable” by offering NFS or PXE boot. The odds are not good that this would work.
Do you have another system with basically the same setup you can test with? What is the exact L4T release being used?
Correct. You won’t be able to update QSPI. Note that if the QSPI content is of a sufficient release version, many of the other partitions or content will work with it. Going across major releases tends to mandate a QSPI update. Unless a bug correction or corruption is involved, QSPI can usually be ignored for updates within minor releases and for many other issues.
Updating non-rootfs eMMC partitions is possible if and only if your binary content is properly signed and fits in that space. There are “--no-flash” options which will create signed versions of non-rootfs partitions. A normal flash creates these, flashes them, and then deletes them. By not actually flashing, you get to keep the signed images for those partitions. This won’t be of any use if the image you create won’t fit into the original partition.
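As a rough illustration of the “fits in that space” condition, here is a minimal Python sketch that compares an image file’s size against a partition size. The function name `image_fits` is made up for this example, and it assumes you already know the partition size in bytes (on a running Linux system, `blockdev --getsize64 <device>` reports it). It says nothing about whether the image is correctly signed.

```python
import os


def image_fits(image_path: str, partition_size_bytes: int) -> bool:
    """Return True if the image file is no larger than the target partition.

    Hypothetical helper for illustration only: it checks size, not signing.
    """
    return os.path.getsize(image_path) <= partition_size_bytes
```

If this returns False, writing the image would run past the end of the partition, so there is no point in going further with that image.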
Rootfs is not signed. So if your rootfs fits, and if the rootfs being replaced is not locked and in use, then you could also overwrite a rootfs (doesn’t matter if it is an “A” or “B” of “a/b” scheme).
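One partial way to check the “not locked and in use” condition is to see whether the partition is currently mounted. The sketch below parses text in the format of `/proc/mounts`; the function name `is_mounted` and the device names in the usage note are illustrative assumptions. It only detects mounts, not other kinds of use, and on some systems the root device can show up under a different name (such as `/dev/root`), so treat a negative result with caution.

```python
def is_mounted(device: str, mounts_text: str) -> bool:
    """Return True if `device` is listed as a mount source.

    `mounts_text` is expected to look like the contents of /proc/mounts,
    where the first whitespace-separated field of each line is the source.
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == device:
            return True
    return False


def read_proc_mounts(path: str = "/proc/mounts") -> str:
    """Read the current mount table on a Linux system."""
    with open(path) as f:
        return f.read()
```

On the Jetson itself this could be called as `is_mounted("/dev/mmcblk0p1", read_proc_mounts())`, where the device path is only a placeholder for whichever rootfs slot you intend to overwrite.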
Keep in mind that when you use the term “flash” you imply recovery mode, and not a running system. Recovery mode flash implies the absolute only access you have for modifying the Jetson is via the flash software over USB. A fully running system does not get flashed. Programs such as dd can overwrite non-locked content on a fully running system, and this can include for example via an ssh tunnel or rsync. There are many tools on a fully booted system which you can pipe data in from either locally or remotely, and pipe that same data out to overwrite something in eMMC, e.g., piping a signed partition from a host PC via a combination of dd and ssh to overwrite an eMMC partition on the running Jetson.
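To make the dd-plus-ssh idea concrete, here is a minimal Python sketch that only builds the shell pipeline as a string, so it can be reviewed before anyone runs it. The function name, the image file name, the ssh target, and the device path are all hypothetical placeholders, and the sketch assumes the remote account can run `sudo dd` without an interactive password prompt. It does not execute anything.

```python
import shlex


def build_remote_write_pipeline(image_path: str, ssh_target: str,
                                device: str, block_size: str = "4M") -> str:
    """Build a host-side command that streams an image to a remote dd.

    Nothing is executed here; the returned string is meant to be inspected
    and then run by hand on the host PC.
    """
    # Local side: read the image and write it to stdout.
    local = f"dd if={shlex.quote(image_path)} bs={block_size}"
    # Remote side: read stdin and write it to the target partition,
    # flushing data to the device before dd exits.
    remote = f"sudo dd of={shlex.quote(device)} bs={block_size} conv=fsync"
    return f"{local} | ssh {shlex.quote(ssh_target)} {shlex.quote(remote)}"


if __name__ == "__main__":
    # Placeholder names only; substitute your own image, host, and partition.
    print(build_remote_write_pipeline(
        "signed_partition.img", "user@jetson.local", "/dev/mmcblk0pN"))
```

Printing the command first and running it manually is deliberate: a wrong device path here overwrites the wrong partition, and as noted above that can leave the unit needing a USB flash.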
4. and 5. Be careful: flashing partitions on external devices probably also means keeping a consistent initrd. If that NVMe partition is not locked and in use, and if the initrd is still valid, then you could remotely update an NVMe partition. You won’t find the signed partitions on the NVMe, but I know some people have worked on new tricks with the JetPack 6/L4T R36.x+ releases, so there are probably a few things you can do with that release that I am not familiar with. If we restrict ourselves to rootfs, then there are a lot of things you can do on any release (not necessarily safe things). For example, you might find device tree content on the eMMC rootfs, on eMMC non-rootfs partitions, or sometimes on the NVMe rootfs. The initrd can greatly complicate this. Some files on the NVMe will be in use if the system is fully booted, and those cannot be updated either. rsync is a good update tool for a rootfs where you are keeping the release but updating files. It won’t matter whether this is done from a remote host through a network pipe or locally: a locked critical file is still locked, and messing with the wrong part will destroy your software in a way that mandates a full flash.
Flash means recovery mode. With flash, content in any part can be edited or replaced. Editing or replacing content on a running system does not necessarily work for all parts, and replacing the parts that can be replaced puts the system at very high risk of failing. You need a similar system to test and practice on. If you don’t have one, then it is near certain that you’ll end up with an unbootable system which mandates USB flash. Locked content which is currently in use won’t care about the method; the update will fail. In recovery mode there is no locked content, so nothing in recovery mode would fail from conflicts.
Yes, post your experiments. Mention any concerns before a step if you plan on doing something with data that is important. Having a clone of rootfs before starting on a second system is a good way to start, but beware that it takes a very long time to copy files that big (and you wouldn’t subject your only copy to something that might harm it). Little steps are a good idea.
Thank you for your help @linuxdev.
For now, our team has decided to go with another solution that does not require flashing over Ethernet.
I’ll close this thread for now. If this becomes an issue in the future, I’ll come back and post the procedure here. Is that OK with you?