SD card damage protection

Note that I performed the cpio unpack in an empty directory, with the source of the unpack being in the parent directory (two dots “..”, not one “.”). The reason for this is that cpio unpacks straight into your current directory. If the source file were in that same directory, you’d end up repacking it with other content not intended to be part of the new initrd.

With this in mind, isn’t this exactly what happens now?

gunzip < ./initrd.img-4.9.201-tegra > initrd-4.9.201-tegra.cpio

This would unzip the thing into the same dir, or?

My bootconfig BTW:

nvidia@jetson:~$ cat /boot/extlinux/extlinux.conf
TIMEOUT 30
DEFAULT overlayfs

MENU TITLE L4T boot options

LABEL primary
      MENU LABEL primary kernel
      LINUX /boot/Image
      INITRD /boot/initrd
      APPEND ${cbootargs} root=/dev/mmcblk0p1 rw rootwait rootfstype=ext4 console=ttyS0,115200n8 console=tty0 fbcon=map:0 net.ifnames=0 

LABEL overlayfs
      MENU LABEL overlayfs
      LINUX /boot/Image
      INITRD /boot/initrd-4.9.201-fixed-tegra.img
      APPEND ${cbootargs} root=/dev/mmcblk0p1 rw rootwait rootfstype=ext4$


# When testing a custom kernel, it is recommended that you create a backup of
# the original kernel and add a new entry to this file so that the device can
# fallback to the original kernel. To do this:
#
# 1, Make a backup of the original kernel
#      sudo cp /boot/Image /boot/Image.backup
#
# 2, Copy your custom kernel into /boot/Image
#
# 3, Uncomment below menu setting lines for the original kernel
#
# 4, Reboot

# LABEL backup
#    MENU LABEL backup kernel
#    LINUX /boot/Image.backup
#    INITRD /boot/initrd
#    APPEND ${cbootargs}

nvidia@jetson:~$

To recap: 3 ways, 3 identical results:

  1. Lehni script (which just basically creates a new initramfs and enters this as primary boot device)
  2. My blind approach to enter the found overlayfs into the boot config as primary device
  3. Your very sophisticated approach which does… something.

All three attempts lead to an endless boot loop.

102 posts after :)

Could you check this sequence?

FYI: Tried again. From scratch. Same result. Doesn’t work. End of lane :)

TNX

When it reboots then here:

[ 0.018192] bootconsole [uart8250] disabled

Here the reboot happens.

In OK boot this follows, so this doesn’t happen in non-ok boot.

[0000.159] [L4T TegraBoot] (version 00.00.2018.01-l4t-e82258de)
[0000.164] Processing in cold boot mode Bootloader 2
[0000.169] A02 Bootrom Patch rev = 1023
[0000.173] Power-up reason: software reset
[0000.176] No Battery Present
[0000.179] pmic max77620 reset reason
[0000.182] pmic max77620 NVERC : 0x0
[0000.186] RamCode = 0
[

…there are some odd automatic formatting issues at times. I am typing two dots inside of a “code block”, so this may be why it preserves two dots for me.

…the above would decompress, but gunzip isn’t extracting the files archived in the .cpio file: the produced “.cpio” file is itself just a single file, regardless of whether it is compressed or not. Only when the “.cpio” file is processed by “cpio -vid” is an entire filesystem unpacked as a directory tree. You can decompress anywhere and still have just one file, but you don’t want to extract the filesystem in the same directory where the source “.cpio” file is located.

Summary:

  • initrd.img-4.9.201-tegra is a compressed cpio archive file.
  • initrd-4.9.201-tegra.cpio is an uncompressed archive file.
  • “cpio -vid < ./initrd-4.9.201-tegra.cpio” leaves the .cpio file where it is, unchanged, and unpacks the entire tree of files next to it. As a result, if you were to repack this directory, you would have an extra file stored in your new archive which you don’t want: “initrd-4.9.201-tegra.cpio”.
  • “cpio -vid < ../initrd-4.9.201-tegra.cpio”, with two periods (“..”), also unpacks the entire directory tree of files, but the original unchanged “.cpio” file is in the parent directory and won’t end up inside the unpacked tree. As a result, repacking contains just the desired file tree, not the .cpio file too, and this is correct.
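As a rough sketch of the summary above (not something posted in this thread): the helper names unpack_initrd and repack_initrd, and the names “tree” and “initrd.cpio”, are made up for illustration. It assumes the image is a gzip-compressed cpio archive in “newc” format, which matches the gunzip step above but should be checked on your own image.

```shell
#!/bin/sh
# Hypothetical helpers; names and directory layout are assumptions.

# unpack_initrd IMG WORKDIR
# Decompresses IMG to WORKDIR/initrd.cpio and extracts it into the empty
# directory WORKDIR/tree, so the .cpio file stays OUTSIDE the unpacked tree.
unpack_initrd() {
    img=$1
    work=$2
    mkdir -p "$work/tree" || return 1
    gunzip < "$img" > "$work/initrd.cpio" || return 1
    # Subshell: the caller's current directory is not changed.
    ( cd "$work/tree" && cpio -id < ../initrd.cpio )
}

# repack_initrd WORKDIR OUT
# Packs only the contents of WORKDIR/tree into a new compressed image OUT.
repack_initrd() {
    work=$1
    out=$2
    ( cd "$work/tree" && find . | cpio -o -H newc | gzip ) > "$out"
}

# Possible usage (run as root if device nodes and ownership must survive):
#   unpack_initrd /boot/initrd.img-4.9.201-tegra "$HOME/initrd-work"
#   ... edit files under "$HOME/initrd-work/tree" ...
#   repack_initrd "$HOME/initrd-work" "$HOME/initrd-4.9.201-fixed-tegra.img"
```

Because the archive sits one level above “tree”, the “find .” used for repacking never sees it.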

You should post the full serial console boot log from the “new” failed case. This might allow debug output to be added in the init of the initrd. This is one reason to know how to unpack, modify, and repack the initrd: To add debug statements.

Also, in the “new” failed case, always include the output (if it boots) of “lsmod”.

So in other words: Your gist as provided is OK now?

Here the OK boot case:

boot_ok.txt (81.4 KB)

Here the FAIL boot case:

boot_fail.txt (71.6 KB)

It hangs a while (5 s) after this line:

[ 0.000000] bootconsole [uart8250] enabled

then it reboots

lsmod in case it comes up:

Would you have any other idea? I’m running out of time. I would consider returning to my strong pair RPI+Coral TPU in order to get my application problem solved. I’m not willing to spend $$$ on an NVIDIA production board just for the sake of making the system RO. And the time we have spent here is also gone w/o result.

So if you think there is still a solution in sight, then PM me and I will send you a Jetson Nano dev kit for your tests.

Otherwise it is end of lane here for me

Regards

I can’t guarantee this being fast enough. I’m not really in a position to take on work either.

Based on what you have above, the module loads. The issue at this point seems to be that the kernel command line is for “rw” filesystem instead of “ro”. There are other configuration changes probably needed for OverlayFS, but having “rw” instead of “ro” is a configuration issue which might be solved with changes to extlinux.conf and/or the initrd with simple edits, e.g., in extlinux.conf change the “rw” to “ro”. In the initrd there is a file, init, and this line at the end might need change (or it might not need change):

# Comment out:
#unset readonly
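A small sketch of the “rw” to “ro” edit, for anyone following along. The function name set_root_ro is hypothetical, and it only prints the edited text so it can be compared against the original before anything in /boot is touched. Whether “ro” alone is enough is not known at this point in the thread.

```shell
#!/bin/sh
# set_root_ro FILE
# Hypothetical helper: prints FILE with a standalone "rw" changed to "ro"
# on active APPEND lines. Commented-out lines (starting with "#") are left
# alone. Nothing is modified in place.
set_root_ro() {
    sed -e '/^[[:space:]]*APPEND/ s/ rw / ro /' \
        -e '/^[[:space:]]*APPEND/ s/ rw$/ ro/' "$1"
}

# Possible usage:
#   set_root_ro /boot/extlinux/extlinux.conf > /tmp/extlinux.conf.new
#   diff /boot/extlinux/extlinux.conf /tmp/extlinux.conf.new
```

Keeping one boot entry unchanged (for example “primary”) leaves a fallback to select from the serial console if the edited entry fails.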

Sorry, accidentally posted prior to finishing. Editing…

The issue I need to research is that the argument “root=/dev/mmcblk0p1” might need to change to the specification of the OverlayFS RAM overlay. Will reply on that in a few minutes.

Do i have to set ro in both entries?

Just some notes on what I’m looking at so this can be tracked by anyone reading. This is not “everything needed”, and I believe this will not be a quick answer. I am basically researching and then using you to test the research result, so I have no doubt this will be stressful and time-consuming.

Before you read all of this, it is possible that if you save your new initrd-4.9.201-fixed-tegra.img somewhere safe, run the script you found earlier, and then drop your initrd-4.9.201-fixed-tegra.img back in place, things might “just work”. You’d also have to make sure your extlinux.conf “INITRD” line still points at your initrd. It is certainly the easiest thing to test. Beyond that, what follows is more or less just a set of incomplete notes. What to do with those notes depends on the result of your new initrd-4.9.201-fixed-tegra.img in combination with the script you tried earlier. FYI, you should probably also save a safe copy of “/boot/Image” prior to running that script, perhaps copying it to “/boot/Image-original”, and in your extlinux.conf, changing any listing of “Image” to “Image-original”…that way it won’t matter if the script overwrites the kernel. You could at any time edit extlinux.conf and have it pick the old or new kernel (the file “Image” is the kernel). Let me know how that works, then read below if you have some patience.

Also, perhaps editing the initrd the script produces to add the original Nano’s modules and the overlay module into the script-generated initrd might be enough. Case 1, test with the non-script-generated initrd-4.9.201-fixed-tegra.img and kernel Image; case 2, test with the script-generated initrd after editing that initrd to include the missing modules.

Now for the part I’d wait to read only if the above fails…

First, I doubt that there are any other changes required for the initrd since it is loading the overlay module now. Verify on the “failed” boot of the new overlayfs attempt that “lsmod” shows the overlay module; if so, then probably no other OverlayFS changes are needed in the initrd, but no guarantee.
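A quick way to script that check, as a sketch: lsmod formats the contents of /proc/modules, so the hypothetical helper module_loaded below greps that file directly (the optional second argument exists only so it can be tried against a saved copy). It only detects a loadable module; a driver built into the kernel would not be listed there.

```shell
#!/bin/sh
# module_loaded NAME [FILE]
# Hypothetical helper: succeeds if NAME appears as a loaded module in FILE
# (default /proc/modules, the same data lsmod displays).
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}"
}

# Possible usage:
#   if module_loaded overlay; then echo "overlay is loaded"; fi
```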

There is a line in the initrd which could be edited to provide better logging information. Near the top of the initrd file “init” there is this: “export debug=”. This could be edited with “export debug=y” for more debug logging within the initrd.

Regarding the scripts you mentioned earlier, I suspect the failures are in the part where it is creating the initial ramdisk, while the rest of the script is “mostly” ok. These scripts are probably out of date with respect to current L4T release boot content within the initrd on the SD card model of Nano. Being sure would take some significant time to research it, but other non-initrd steps can be performed first to avoid such effort and it might “just work” after non-initrd changes.

What I do wish I had right now is the output “cat /proc/cmdline” on any Jetson of any release version which has a working OverlayFS. This would be an enormous speed boost for research. Does anyone here have a “/proc/cmdline” from a Jetson with OverlayFS? Or even from a PC with OverlayFS?

For reference, here is your current /proc/cmdline:

tegraid=21.1.2.0.0 ddr_die=4096M@2048M section=512M memtype=0 vpr_resize usb_port_owner_info=0 lane_owner_info=0 emc_max_dvfs=0 touch_id=0@63 video=tegrafb no_console_suspend=1 console=ttyS0,115200n8 debug_uartport=lsport,4 earlyprintk=uart8250-32bit,0x70006000 maxcpus=4 usbcore.old_scheme_first=1 lp0_vec=0x1000@0xff780000 core_edp_mv=1075 core_edp_ma=4000 gpt tegra_fbmem=0x800000@0x92ca9000 is_hdmi_initialised=1  earlycon=uart8250,mmio32,0x70006000  root=/dev/mmcblk0p1 rw rootwait rootfstype=ext4 console=ttyS0,115200n8 console=tty0 fbcon=map:0 net.ifnames=0  root=/dev/mmcblk0p1 rw rootwait rootfstype=ext4 console=ttyS0,115200n8 console=tty0 fbcon=map:0 net.ifnames=0

The parts of interest (an excerpt of above):

root=/dev/mmcblk0p1 rw rootwait rootfstype=ext4 ...
 ... root=/dev/mmcblk0p1 rw rootwait rootfstype=ext4

The root= specification is listed twice, and normally only the last copy of a repeated argument matters. The first copy is likely from earlier boot stages, and might be accessed during initrd, so what “usually doesn’t matter” might actually matter. It is confirmed from the boot log prior to the Linux kernel loading that read-write is first specified in the initrd. Here is an excerpt of the log prior to Linux loading:

[0004.569] boot image cmdline: root=/dev/mmcblk0p1 rw rootwait rootfstype=ext4 console=ttyS0,115200n8 console=tty0 fbcon=map:0 net.ifnames=0 
[0004.582] Updated bootarg info to DTB
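To see which value of a repeated argument comes last on the command line, something like the following sketch can help. The helper name last_arg is made up, and it relies on simple whitespace splitting, so it would not handle quoted arguments containing spaces.

```shell
#!/bin/sh
# last_arg KEY CMDLINE
# Hypothetical helper: prints the value of the LAST "KEY=value" word in
# CMDLINE, or an empty line if KEY does not occur.
last_arg() (
    set -f                      # no glob expansion while word-splitting
    key=$1
    val=
    for word in $2; do
        case $word in
            "$key"=*) val=${word#"$key"=} ;;
        esac
    done
    printf '%s\n' "$val"
)

# Possible usage on the running system:
#   last_arg root "$(cat /proc/cmdline)"
#   last_arg rootfstype "$(cat /proc/cmdline)"
```

Running it before and after an extlinux.conf edit shows whether the edit actually reached the kernel command line.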

At minimum the “rootfstype=ext4” needs to change to “overlay”. Other arguments need to be added equivalent to this (some setup is no doubt required for creating those directories as an overlay system):

mount -t overlay overlay -o lowerdir=/lower,upperdir=/upper,workdir=/work /merged

…but I do not have experience with OverlayFS so I’m not sure how to do that from the initrd and extlinux.conf. Perhaps it is as simple as changing this excerpt in extlinux.conf for the “rootfstype”, but I don’t know what the other changes would be (I’m guessing at part of it):

…from:
APPEND ${cbootargs} root=/dev/mmcblk0p1 rw rootwait rootfstype=ext4
…to:
APPEND ${cbootargs} root=/merged rw rootwait rootfstype=overlay
(some setup would have had to have first created “/merged” as read-only ext4 under the ramdisk overlay…at this point I have no idea what that is and need to research, but those changes might need to be at least partly inside the initrd)

There would also need to be an edit to “/etc/fstab”. Again, since I do not have OverlayFS experience, I would have to research it, but it appears to need something like this:
overlay /merged overlay noauto,x-systemd.automount,lowerdir=/lower,upperdir=/upper,workdir=/work 0 0
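For reference, here is a sketch of the directory setup the mount command above would need, using a tmpfs for the writable layer. This is untested on a Jetson and is not the initrd integration itself. The names run and setup_overlay and the example paths are assumptions. The one firm requirement it reflects is that upperdir and workdir must be on the same filesystem. With DRY_RUN=1 the commands are only printed.

```shell
#!/bin/sh
# run CMD...: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# setup_overlay LOWER RAMDIR MERGED
# Hypothetical helper: LOWER is the (read-only) tree, RAMDIR gets a tmpfs
# holding the upper and work directories, MERGED is the combined view.
setup_overlay() {
    lower=$1
    ram=$2
    merged=$3
    run mkdir -p "$ram" "$merged" &&
    run mount -t tmpfs tmpfs "$ram" &&
    run mkdir -p "$ram/upper" "$ram/work" &&
    run mount -t overlay overlay \
        -o "lowerdir=$lower,upperdir=$ram/upper,workdir=$ram/work" "$merged"
}

# Possible usage (prints the commands without needing root):
#   DRY_RUN=1
#   setup_overlay /lower /ram /merged
```

Everything written through the merged view lands in the tmpfs and is lost on reboot, which is the point when the goal is to keep writes off the SD card.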

Before you read all of this, it is possible that if you save your new initrd-4.9.201-fixed-tegra.img somewhere safe, run the script you found earlier, and then drop your initrd-4.9.201-fixed-tegra.img back in place, things might “just work”

What do you mean?

I can’t guarantee this being fast enough. I’m not really in a position to take on work either.

You are right, but I offered you an assignment several times.

I’m not quite sure what you are expecting me to do now. You are always very verbose in your post and most of it is white noise to me.

So sorry for being too dumb, but I can’t follow you anymore. And I don’t think it should necessarily be that complicated at all.

Regards

Earlier you had mentioned this GitHub URL:
https://gist.github.com/neilyoung/5a680052dbdce6258183754031c023e9

That URL had some steps for installing to a Nano, though it is out of date. If you were to follow this, and then put the kernel modules in place the way you did with the previous initrd, it might “just work”. Or if that won’t boot, take the new initrd we just made and use it to replace the one generated by that script; this might “just work”. This is one of the few things to test which won’t be frustrating in terms of time and effort. If that fails, then it is time to move on to some steps which will take time and will probably be frustrating to follow. I’m quite willing to try to help with those steps, but it will be slow and involve lots of time.

As far as contract work goes, I am not in a position to take on work. Even if I did so, then there is no guarantee it would be any faster. Since I’m not doing this for pay I tend to try to teach how to do things. Simply adding an answer without trying to teach how to arrive there implies the question would recur by many people over time, whereas teaching how to get this working via testing and research means future versions of the question (even after releases change) would be avoided. Simply fixing things production style isn’t much fun and has a tendency to never stop.

Earlier you had mentioned this GitHub URL:
An (incomplete) attempt to make the Jetson Nano rootfs on SD read only · GitHub

No, I guess you mean something different, like the “lehni-script” or so. This gist was an attempt to do something similar to what I have done on the PI (before I discovered how much easier it is with overlayfs meanwhile). It follows the idea of preventing any write request to the SD by redirecting any writes to tmpfs and replacing services which are known to be bitchy with these redirects by others. EDIT: And making the /etc/fstab entry for the SD ro

This worked on the PI but not on the Nano, since an Nvidia specific script failed. I didn’t dive too deep into this, but there were very concerning log entries (like: creation of symlink fails, file system read-only). I had no idea how to replace those requests and gave up this attempt.

This gist never ran. It booted to some stage, but that was it.

If you mean the “lehni” script, then I have tried that several times. Basically this script creates a new image using initramfs and enters this as INITRD. All these attempts ended up in exactly the same boot loop as now. Also, there is already an image by default which looks like an initramfs after installation. This also does not boot correctly.

Currently noticing that I have documented my progress. But I gave up with the symlink fails.

BTW: What if I would give you SSH to the Nano? Would that help?

ssh works for some things, but not for working on boot. That generally requires physical access with serial console and the USB cable with recovery mode. A lot of people have asked about OverlayFS, and maybe when I am in a better position to actually work on something like this I will, but it will likely be some time before I can do that.

Ok. Thanks again for your efforts.