There is probably more than one way. I can tell you about something which is non-traditional, and more used by kernel developers who might lock up the system.
There is something called the “Magic SysRQ” key bindings. Even when a system has more or less crashed and is locked up close to dead, parts of it usually still function. The Magic SysRQ usually continues to function, and it is a bit of a “Swiss army knife” of useful things for working with a failed system. Access is either via key bindings (if there is a local keyboard), or via an echo of a character to “/proc/sysrq-trigger”. If the process is to be automated, then obviously you will need the echo method, but I’ll demo some of it using key bindings (unless this has been disabled, you’ll be able to see this even on your host PC).
First, the parts which will be useful:
- How to sync (flush unwritten content to disk).
- How to remount the filesystem read-only (prevent new content).
- How to force shutdown.
Quite a while back someone pointed out that it is possible for a sync to still be in progress, but a second sync won’t start until the first completes (so using two sync in a row guarantees the first one completes). On your host PC or on your Jetson with a local keyboard, monitor “dmesg --follow”. Now type this keystroke combination: ALT-SYSRQ-s (that’s holding the ALT key down, then adding the SYSRQ key, which is the same key as PrtScrn, and then tapping the s key).
As long as magic sysrq was not disabled (and it seems to never be by default, although there is a mask to reduce which events are permitted) you should see a message about emergency sync.
Now take a look at the permissions here:
ls -l /proc/sysrq-trigger
Only user root can write to that file, and this is necessary for unattended or automated use of those functions (unless of course you have built your own custom keyboard capable of this). Any automation will need to already be running as root, since it takes time to log in, and often a system issue means root literally cannot log in. So in a terminal go into a root shell, e.g., run the command “sudo -s” (there are other ways as well).
Now, while still monitoring “dmesg --follow”, type this command as root:
echo 's' > /proc/sysrq-trigger
Note that this is in part sort of how the watchdog timer works (they are not the same, but at lower levels they will share some of the code; a dead watchdog with a nearly dead system still requires code to function which can perform a shutdown or reboot).
Here are a series of events which can be performed via magic sysrq to do what you want:
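- Sync twice (“echo 's' > /proc/sysrq-trigger” twice, or use the keyboard ALT-SYSRQ-s twice).
- Remount all mounted filesystems read-only: “echo 'u' > /proc/sysrq-trigger”, or “ALT-SYSRQ-u”.
- Reboot the system immediately: “echo 'b' > /proc/sysrq-trigger” or “ALT-SYSRQ-b”. The 'b' command reboots without syncing or unmounting anything, so it has to come last. If you want the system powered off rather than rebooted, the 'o' command does that where it is supported.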
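The sequence above could be wrapped in a small helper for scripted use. This is only a sketch: the function name “sysrq_send” and the “SYSRQ_TRIGGER” override variable are hypothetical names of my own (the override exists only so you can dry-run against an ordinary file instead of the real trigger).

```shell
#!/bin/sh
# Sketch: send a list of magic SysRq command characters, one write per character.
# SYSRQ_TRIGGER is a hypothetical override for dry runs; by default the real
# /proc/sysrq-trigger is used, which only root can write to.
sysrq_send() {
    trigger="${SYSRQ_TRIGGER:-/proc/sysrq-trigger}"
    for key in "$@"; do
        # Each write carries a single command character.
        printf '%s' "$key" > "$trigger" || return 1
        echo "sent sysrq '$key'"
    done
}
```

From a root shell, “sysrq_send s s u” would sync twice and then remount read-only. Adding a final 'b' reboots the machine on the spot, so try that only on a system you do not mind restarting.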
Note that not all magic sysrq functions are available on all systems. Also, there is a mask to remove parts of this if desired. The file where this is usually edited is “/etc/sysctl.conf”. If you see a statement like this in that file, then it is a bitwise mask (you’d have to convert to binary and then check the list of which bits mask which function):
kernel.sysrq=438
I see a comment in one Ubuntu system referencing this:
https://www.kernel.org/doc/html/latest/admin-guide/sysrq.html
(this is a nice guide specific to Linux; the standard covers other *NIX systems as well)
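To save converting the mask by hand, something like the following could decode it. This is a sketch: “sysrq_decode” is a name I made up, and the bit meanings are copied from the kernel guide linked above. That guide also says the mask only restricts keyboard invocation, and that root writing to /proc/sysrq-trigger is always allowed.

```shell
#!/bin/sh
# Sketch: list which SysRq function groups a kernel.sysrq value enables.
# 0 disables everything, 1 enables everything, anything else is a bitmask.
sysrq_decode() {
    mask="$1"
    if [ "$mask" -eq 0 ]; then echo "sysrq disabled"; return 0; fi
    if [ "$mask" -eq 1 ]; then echo "all functions enabled"; return 0; fi
    # Each line of the table below is "<bit value> <meaning>".
    while read -r bit meaning; do
        if [ $(( $mask & $bit )) -ne 0 ]; then
            echo "$bit: $meaning"
        fi
    done <<'EOF'
2 control of console logging level
4 control of keyboard (SAK, unraw)
8 debugging dumps of processes etc.
16 sync command
32 remount read-only
64 signalling of processes (term, kill, oom-kill)
128 reboot/poweroff
256 nicing of all RT tasks
EOF
}
```

Running “sysrq_decode 438” lists bits 2, 4, 16, 32, 128 and 256, so sync, remount read-only and reboot are all permitted from the keyboard with that value, while debugging dumps (8) and process signalling (64) are not. The live value can be read with “cat /proc/sys/kernel/sysrq”.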
If you already have a process running as root, and if that process has a way to monitor that you have switched to supercap, then it could quickly echo ‘s’ ‘s’ ‘u’ ‘b’. In reality, so long as it has reached the ‘u’ event to go read-only, then the disk itself won’t corrupt. It depends mostly on whether or not there is time and power to sync the disk and switch to read-only first. However, having the system down also protects against things like power spikes, and so the final ‘b’ event is also useful.
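A very rough sketch of such a root process is below. How you detect the switch to supercap depends entirely on your hardware, so the status file “/run/power-source” and its “supercap” content are pure assumptions on my part, as are the function names; substitute whatever your board actually exposes (a GPIO, an ADC reading, etc.).

```shell
#!/bin/sh
# HYPOTHETICAL: a file that reads "supercap" once main power is lost.
POWER_STATE_FILE="${POWER_STATE_FILE:-/run/power-source}"
# Overridable only so the logic can be dry-run against an ordinary file.
SYSRQ_TRIGGER="${SYSRQ_TRIGGER:-/proc/sysrq-trigger}"

# True when the (assumed) status file says we are on backup power.
on_backup_power() {
    [ -r "$POWER_STATE_FILE" ] && [ "$(cat "$POWER_STATE_FILE")" = "supercap" ]
}

# Sync twice, remount read-only, then reboot immediately.
emergency_stop() {
    for key in s s u b; do
        printf '%s' "$key" > "$SYSRQ_TRIGGER"
    done
}

# Poll until backup power is detected, then run the emergency sequence.
# The poll interval (seconds, default 1) is an arbitrary choice.
watch_power() {
    interval="${1:-1}"
    until on_backup_power; do
        sleep "$interval"
    done
    emergency_stop
}
```

Started as root at boot with “watch_power”, this just sits in the loop until the status changes. Since the supercap only buys a short window, you would want the poll interval well below that window, or better yet an interrupt-style notification if your hardware offers one.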
So far as other methods go, none are even remotely as fast (at least none that I know of which can save data). You could for example permanently set your disk to synchronous mode, and in that case:
- There would never be any outstanding data, so it cannot corrupt.
- Performance would drop so dramatically you might wonder if the system is running or failing.
- Solid state memory wear leveling would not be enough, and your solid state storage would fail quite quickly.
Having a read-only filesystem does not harm the system no matter how much reading you perform, so if you stopped at the remount to read-only so you could see what is going on, and if you don’t care about power consumption at that point, then you could safely skip the ‘b’ event.