Storing all important software on an SSD so that, in the event of a mistake, we can re-flash and continue without having to reinstall software

Good day everyone, my team and I experienced some issues where we had to re-flash our Jetson AGX.

To avoid having to reinstall packages if this happens again, we looked into copying (with rsync) the /usr and /home directories to separate partitions on an SSD and mounting those partitions at /usr and /home.

Now we are wondering which other folders we can copy to the SSD so that, if we need to re-flash, we can just re-mount these folders and continue as if nothing had happened.

Also, are there any other precautions we need to take? For example, when setting up, should the user have the same name and password, or should we go into the terminal and edit fstab to mount these partitions before user setup?

Or are there any better procedures to follow?
Thank you.

Any directory that does not participate in boot should be able to use this method. “/home” should work, but mounting “/usr” this way would fail, since boot depends on its content. Most CUDA-related content is in “/usr/local”, and “/usr/local” should be ok to mount this way. The main trick is that the mounts in “/etc/fstab” should use the option “nofail”, so that boot continues anyway if the partitions are not present, while the partitions do get used if they are present (such as when recovering from a disaster). I do this on my main PC for “/home”, “/usr/local”, and “/var/www”.

If we pretend that your running system normally used “/dev/mmcblk0p1” for the root filesystem, and if we pretend that a partition “/dev/sdb1” contains a copy of the home directory, then an fstab line might look like this:

/dev/sdb1   /home   ext4   rw,suid,dev,exec,auto,nouser,async,nofail  0  2

Note that the option “defaults” is equivalent to all of those options except “nofail”:
rw,suid,dev,exec,auto,nouser,async
(see “man fstab”)
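
Since “defaults” covers everything except “nofail”, the entry above can be shortened to “defaults,nofail”. A minimal sketch of a helper that prints such an entry follows; the function name fstab_line is hypothetical, and the device and mount point are the same pretend values used above:

```shell
#!/bin/sh
# Print one /etc/fstab entry for an optional (nofail) data partition.
# Arguments: device, mount point, filesystem type (default ext4),
# fsck pass number (default 2).
# fstab_line is a hypothetical helper name, not a standard tool.
fstab_line() {
    dev=$1
    mnt=$2
    fstype=${3:-ext4}
    passno=${4:-2}
    printf '%s\t%s\t%s\tdefaults,nofail\t0\t%s\n' "$dev" "$mnt" "$fstype" "$passno"
}

# Example: the same pretend /dev/sdb1 partition holding /home.
fstab_line /dev/sdb1 /home
```

Review the printed line before adding it to “/etc/fstab” by hand; the helper only formats text and does not touch the system.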

The last field (field 6) is used by the filesystem repair software (fsck) to determine the order of repair. One would repair secondary mounts after the parent filesystem, thus it is 2 (the rootfs would be 1). If I had yet another partition, I might choose to make this “3” for that next filesystem. Most of the time this won’t matter, but if, for example, power is shut off without a proper shutdown and filesystems need repair, then one would want to always repair the parent filesystem first.
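
To see the repair order implied by field 6, one can list the entries with a nonzero pass number, lowest first. This is only a sketch: the function name fsck_order is hypothetical, and it simply reads an fstab-format file rather than asking fsck itself:

```shell
#!/bin/sh
# List fstab entries that have a nonzero pass number (field 6),
# lowest pass number first, as "<passno> <mount point>".
# Argument: path to an fstab-format file (default /etc/fstab).
# fsck_order is a hypothetical helper name.
fsck_order() {
    # Skip comment lines and entries without an explicit sixth field
    # (a missing pass number is treated as 0, i.e. never checked).
    awk '$1 !~ /^#/ && NF >= 6 && $6 > 0 { print $6, $2 }' "${1:-/etc/fstab}" | sort -n
}

# Usage on a running system:
#   fsck_order /etc/fstab
```

Entries with “0” in field 6 do not appear in the output because they are never checked automatically.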

The next-to-last field (field 5) is just a “0” if you don’t want this filesystem dumped automatically during a system backup. A manual dump naming that location would still dump it anyway, and it is likely you won’t run automated general dumps.

Note that you could remove “nofail” if you want a failure of those partitions to always cause boot failure. I tend to copy the subdirectory content (such as the content under “/home”) of the running system to the backup partition while that partition is mounted at some alternate mount point. Then, when you mount that partition over the actual mount point (e.g., on “/home”), the new partition will look just like the old one. If the new partition is not mounted, then the system reverts to the old content. If you update the new partition over time, then its content will begin to differ from the original, but the original is still a nice rescue environment.
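
The copy step might look something like the sketch below. The function name copy_tree and the alternate mount point “/mnt/newhome” are assumptions for illustration, and “/dev/sdb1” is the same pretend partition as before. Ownership is only preserved fully when this runs as root:

```shell
#!/bin/sh
# Copy the contents of one directory tree into another, preserving
# permissions, timestamps, symlinks, and (when run as root) ownership.
# Arguments: source directory, destination directory (must already exist).
# copy_tree is a hypothetical helper name.
copy_tree() {
    src=$1
    dst=$2
    if command -v rsync >/dev/null 2>&1; then
        # Trailing slashes: copy the contents of src, not src itself.
        # -a is archive mode, -H also preserves hard links.
        rsync -aH "$src"/ "$dst"/
    else
        # Fallback when rsync is not installed.
        cp -a "$src"/. "$dst"/
    fi
}

# Hypothetical usage, run as root, with the SSD partition mounted at an
# alternate mount point first:
#   mkdir -p /mnt/newhome
#   mount /dev/sdb1 /mnt/newhome
#   copy_tree /home /mnt/newhome
#   umount /mnt/newhome
```

After the copy, mounting the partition on “/home” (manually or through the fstab entry) should show the same content as the original directory.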