Failover rootfs partition

Our system is configured with two rootfs partitions. We upgrade the alternate (non-running) rootfs partition, then modify extlinux.conf to point to the alternate partition before rebooting into it.

Is there a way to add both partitions to the extlinux.conf in such a way that if the first primary partition fails to mount, we fall back to the second primary partition?

It would be great if this could be done without modifying/rebuilding the bootloader.

I imagine if it’s possible at all, we’d need to remove the rootwait kernel argument.

Thank you!

I’ll have to have a look at the bootloader source.

In older releases one could just add a second entry to extlinux.conf and pick this during boot via a serial console cable. You’re probably still using R28.1, in which case this should be possible. However, automatic failover would involve changing U-Boot environment variables.
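For reference, here is a minimal sketch of generating such a two-entry extlinux.conf. The labels, the device names (/dev/mmcblk0p1 and /dev/mmcblk0p2), the kernel path, and the use of ${cbootargs} are assumptions for illustration, so compare against your existing extlinux.conf before using any of it. This only gives you a second entry to pick by hand at the serial console; it does not fail over automatically.

```python
# Hypothetical sketch: render an extlinux.conf with one entry per rootfs partition.
# Device names, labels, and the kernel path are assumptions, not values from this thread.

def render_extlinux(entries, default_label, timeout=30):
    """Return extlinux.conf text for a list of (label, root_device) pairs."""
    labels = [label for label, _ in entries]
    if default_label not in labels:
        raise ValueError(f"default label {default_label!r} is not among {labels}")
    lines = [f"TIMEOUT {timeout}", f"DEFAULT {default_label}", ""]
    for label, root_device in entries:
        lines += [
            f"LABEL {label}",
            f"      MENU LABEL {label} rootfs",
            "      LINUX /boot/Image",
            # ${cbootargs} is expanded by the bootloader, so it is emitted literally here.
            f"      APPEND ${{cbootargs}} root={root_device} rw rootwait",
            "",
        ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_extlinux(
        [("primary", "/dev/mmcblk0p1"), ("alternate", "/dev/mmcblk0p2")],
        default_label="primary",
    ))
```

Switching the default after an upgrade would then only mean regenerating the file with a different default_label.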

If you boot to the U-Boot console you have access to environment variables. If you type “help” you can get a list of commands, one of which should be “printenv” (and it would be useful to turn on serial console logging prior to this to have it in a file for reference).

Note in particular that the command to continue booting is literally “boot”. The “help” tells you this runs macro “bootcmd”. This in turn expands and runs more macros, e.g., “distro_bootcmd” (see “printenv bootcmd” and then “printenv distro_bootcmd”). The previous “printenv” would have printed these out.

Eventually you will reach the boot targets macro, where the first bootable partition found is used for configuration (the extlinux.conf there is used). That extlinux.conf could point at anything, so this partition isn’t necessarily the rootfs (the extlinux.conf partition is often the same one used as the rootfs, but this isn’t a rule). And herein lies the trouble: U-Boot decides whether a partition is bootable only by whether the ext4 filesystem can be read and whether extlinux.conf exists in one of the optional locations, e.g., in “/boot/extlinux/extlinux.conf”. For an automatic failover to occur without a serial console assist, extlinux.conf must be missing from the first partition (or that partition must be unreadable).
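To make that test concrete, here is a rough model of it in Python. This is not U-Boot code, just a sketch of the logic as I understand it. The search prefixes are an assumption (check “printenv boot_prefixes” on your device), and the function names are made up for illustration.

```python
# Hypothetical model of the bootability test; not U-Boot code.
from pathlib import Path

# Assumed search prefixes; check "printenv boot_prefixes" on the actual device.
BOOT_PREFIXES = ("", "boot")


def find_extlinux(partition_root: Path):
    """Return the first extlinux.conf found under the assumed prefixes, or None."""
    for prefix in BOOT_PREFIXES:
        candidate = partition_root / prefix / "extlinux" / "extlinux.conf"
        if candidate.is_file():
            return candidate
    return None


def pick_boot_partition(partition_roots):
    """Return the first partition root that contains an extlinux.conf, or None."""
    for root in partition_roots:
        if find_extlinux(root) is not None:
            return root
    return None
```

Note that nothing here looks at whether the rootfs named in the APPEND line is healthy, which is why a damaged rootfs alone does not trigger a fallback.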

So a lot of what you want to do depends on how you define a failed partition (and on how you modify U-Boot to test for that instead of the simple existence of extlinux.conf). You could modify U-Boot to search for a second, custom file. That file (an empty file created by “touch”) would be created upon shutdown and removed again during the next startup, so if shutdown was abnormal the file would not exist at the next boot, and the failover would boot instead. The failover would always have this file, and it would never be removed there under any circumstance. The failover, upon boot, could in fact write the test file back into the first root partition. It isn’t the best way to do it, but it is something you could get working and then improve by changing the test condition for whether or not to fail the first partition.
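The Linux side of that marker file idea might look roughly like the sketch below. The marker name “.boot_ok” and all function names are hypothetical, and the U-Boot change that actually checks for the file is not shown; would_boot only mimics what that check would decide.

```python
# Hypothetical sketch of the clean-shutdown marker scheme described above.
# The marker name ".boot_ok" and all function names are assumptions for illustration.
from pathlib import Path

MARKER_NAME = ".boot_ok"  # hypothetical file a modified U-Boot would look for


def consume_marker(rootfs: Path) -> bool:
    """Run early at startup on the first rootfs: remove the marker so that an
    abnormal shutdown leaves it absent. Returns True if the marker was present."""
    marker = rootfs / MARKER_NAME
    if marker.exists():
        marker.unlink()
        return True
    return False


def write_marker(rootfs: Path) -> None:
    """Run at clean shutdown (or from the failover system, against the mounted
    first rootfs) to mark the partition as bootable again."""
    (rootfs / MARKER_NAME).touch()


def would_boot(rootfs: Path) -> bool:
    """Mimic the check a modified U-Boot would make: boot only if the marker exists."""
    return (rootfs / MARKER_NAME).exists()
```

You would call consume_marker from an early startup unit and write_marker from a late shutdown unit on the first rootfs; the failover rootfs would never call consume_marker on itself.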

FYI, if you modify the automatic detection to require a second file, I would make sure the serial console can still skip the test for that file. Give the U-Boot serial console the ability to ignore any special test condition and forcibly boot whichever entry the user at the console wants to pick.