SD card damage protection

The board came with the dev kit carrier. I flashed JP45 and it definitely booted from SD. I’m just confused about what you are saying (that the boot device is eMMC), since I never edited that extlinux.conf… Anyway, I will need yet another week to replace the (again!) damaged board.

I will let you know (if I don’t abandon all this earlier, but then I will let you know too :))

That is just an illustration. If you have an extlinux.conf pointing at a given device, and you’ve changed boot device in some way but not changed the extlinux.conf, then you won’t be using your changes. The boot device for a read-only system is likely in need of changes to extlinux.conf. That specification may require edits in any extlinux.conf involved, and might also include a specification of mount options in the initrd as well. Make sure that what you see from “cat /proc/cmdline” is what you actually expected (“cmdline” is a reflection of what the running kernel thinks its startup options were). If something does not match, then start looking at all versions of extlinux.conf.
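As a rough sketch of that check (not something from the original posts): the helper below pulls a single option such as root= out of a kernel command line string, so the running value can be compared by eye against what extlinux.conf says. The function name cmdline_opt is made up for illustration; the two paths are the ones named above.

```shell
# Print the value of one key=value option from a kernel command line string.
# Usage: cmdline_opt KEY "CMDLINE"   (cmdline_opt is a hypothetical helper name)
cmdline_opt() {
    key=$1
    for word in $2; do
        case $word in
            "$key"=*) printf '%s\n' "${word#*=}" ;;
        esac
    done
}

# On the Jetson, compare what the running kernel was given with what the file says:
#   cmdline_opt root "$(cat /proc/cmdline)"
#   grep -nE '^[[:space:]]*(INITRD|APPEND)' /boot/extlinux/extlinux.conf
```

If the root= value printed from /proc/cmdline is not the device named on the APPEND line, then some other extlinux.conf (or none of them) is being used.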

Ah, yes. Thanks for the detailed explanation. Well, a replacement board is on the way again :) Let’s see, how long that will last then :))

@linuxdev: Ready to resume? In the meantime, while waiting for the next Nano replacement, I was following this little repo, lehni/root-ro (read-only root filesystem for Raspbian Stretch and Jetson Nano, using OverlayFS), which worked perfectly on my RPI3. After running the script the overlay fs was set up and all writes went to RAM. There is also a Nano setup script, which doesn’t work for me. Would you be able to help me make it work?

Up with the Nano and a fresh image installed. Waiting for your suggestions.

I don’t know what the issue is, I have not tried this myself. There is however a likelihood that any Nano setup script will depend on whether it is the eMMC version of Nano, or if instead it is an SD card version of the Nano. Device trees will differ.

It is only a small probability that the revision of the Nano would also be involved (there can be more than one hardware revision within a single model).

One more possibility is that some of the more recent releases may use the QSPI memory for boot content, but older releases could instead put that same information in eMMC or SD card. Which release did you try on the Nano which failed? If it is the latest in the R32.5 series, then you might want to try the slightly older R32.4 series (or R32.3 series) to find out if this is due to a change in how booting is arranged. To see your current L4T release use “head -n 1 /etc/nv_tegra_release”.

Once you’ve tried that, then you would probably want to show a full serial console boot log.

neil@jetson:~$ head -n 1 /etc/nv_tegra_release
# R32 (release), REVISION: 5.0, GCID: 25531747, BOARD: t210ref, EABI: aarch64, DATE: Fri Jan 15 22:55:35 UTC 2021
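For anyone comparing releases, here is a small sketch that turns that first line into a short release string. It assumes the line always has the "# R<major> (release), REVISION: <x.y>," layout shown above; the function name l4t_release is made up for illustration.

```shell
# Reduce the first line of /etc/nv_tegra_release to "R<major>.<revision>".
# Assumes the "# R32 (release), REVISION: 5.0, ..." layout shown in the post.
l4t_release() {
    sed -n '1s/^# R\([0-9][0-9]*\) (release), REVISION: \([0-9.][0-9.]*\),.*/R\1.\2/p'
}

# On the Jetson:
#   l4t_release < /etc/nv_tegra_release
```

For the line posted above this gives R32.5.0, which is the L4T release that ships with JetPack 4.5.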

This is what dmesg says regarding the boot device.

I wouldn’t know where to find the older R32.x versions. I tried with an image from last year containing JP4.3, but this didn’t work either.

I had not applied the “lehni” script yet. Here is my original bootlog.txt:

bootlog.txt (209.6 KB)

I will now go and apply the RO script.

EDIT: Output after sudo ./install-nano.sh

neil@jetson:~/root-ro$ sudo ./install-nano.sh
dphys-swapfile is not installed, assuming we dont need to disable swap
Setting up maintenance scripts in /root...
Setting up initramfs-tools scripts...
Adding "overlay" to /etc/initramfs-tools/modules
Updating initramfs...
update-initramfs: Generating /boot/initrd.img-4.9.201-tegra
Warning: couldn't identify filesystem type for fsck hook, ignoring.
I: The initramfs will attempt to resume from /dev/zram3
I: (UUID=1b6d8b6e-3268-4ca0-bf49-40cb034d0e61)
I: Set the RESUME variable to override this.
/sbin/ldconfig.real: Warning: ignoring configuration file that cannot be opened: /etc/ld.so.conf.d/aarch64-linux-gnu_EGL.conf: No such file or directory
/sbin/ldconfig.real: Warning: ignoring configuration file that cannot be opened: /etc/ld.so.conf.d/aarch64-linux-gnu_GL.conf: No such file or directory
Changing INITRD in /boot/extlinux/extlinux.conf
Removing the random seed file
Please restart your Jetson Nano now to boot into read-only mode
neil@jetson:~/root-ro$

The suggested restart gives:

neil@jetson:~/root-ro$ sudo /root/reboot-ro
The root filesystem is already in readonly mode.

After the reboot the system does not return to desktop and reboots permanently.

Consider an actual serial console boot log since these can be searched in a text editor and what occurs earlier in boot might matter.

The release list for L4T is here:
https://developer.nvidia.com/linux-tegra

The release list for JetPack/SDKM is here:
https://developer.nvidia.com/embedded/jetpack-archive

Do know that you don’t have to use JetPack/SDKM if just flashing. If for some reason you have issues with getting a version of L4T you want via JetPack, then instead download the “driver package” and “sample rootfs”. This is what makes up the content here:
~/nvidia/nvidia_sdk/JetPack...version.../Linux_for_Tegra/

When you unpack the driver package, do so as a regular user (don’t use sudo). This produces subdirectory “Linux_for_Tegra/”.

Within “Linux_for_Tegra/” is subdirectory “rootfs/”. The content of the “sample root filesystem” is what initially populates that subdirectory. You’d need to unpack this into “rootfs/” as root (do use sudo).

Then, from “Linux_for_Tegra/”, you can complete the preparation of “rootfs/” by running this command:
sudo ./apply_binaries.sh

What this does is take a purely Ubuntu 18.04 in “rootfs/” and add NVIDIA-specific binaries into that directory. This and all above only needs to be done once. From there flashing on command line is available using that version of driver package plus sample rootfs. This is what JetPack/SDKM would download and unpack on first use, but having this content somewhere without JetPack/SDKM can be useful. This is also the most simplified way to flash.
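The one-time preparation described above can be sketched as a dry run. This is only an illustration under assumptions: the function name prepare_l4t and the RUN switch are made up, and the two archive names are placeholders for whatever files you downloaded from the L4T release page.

```shell
# Dry-run sketch of the one-time preparation of Linux_for_Tegra/.
# By default each command is only printed; call with RUN= (empty) to execute.
# prepare_l4t and RUN are hypothetical names, and both archive arguments are
# placeholders for the downloaded "driver package" and "sample rootfs" files.
prepare_l4t() {
    driver_pkg=$1
    sample_rootfs=$2
    run=${RUN-echo}

    # 1. Unpack the driver package as a regular user; this creates Linux_for_Tegra/
    $run tar xf "$driver_pkg"
    # 2. Unpack the sample rootfs into Linux_for_Tegra/rootfs/ as root
    $run sudo tar xpf "$sample_rootfs" -C Linux_for_Tegra/rootfs
    # 3. Add the NVIDIA-specific binaries, run from within Linux_for_Tegra/
    $run sh -c 'cd Linux_for_Tegra && sudo ./apply_binaries.sh'
}

# Print the plan:   prepare_l4t driver_package.tbz2 sample_rootfs.tbz2
# Actually run it:  RUN= prepare_l4t driver_package.tbz2 sample_rootfs.tbz2
```

Printing first makes it easy to confirm that only the rootfs steps use sudo before anything is unpacked.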

Once that is in place, and the Jetson is in recovery mode (I’m assuming eMMC model, there would probably be some differences for SD card model which you’d have to adjust for) with the micro-B USB connected, then this should flash:
sudo ./flash.sh jetson-nano-emmc mmcblk0p1
(this puts a slight bit of content into the “rootfs/” in the “rootfs/boot/” directory based on boot requirements of the named Jetson model)

You’d be able to complete first boot setup after that. Consider having a monitor/keyboard on the Jetson during flash.

If your content is valuable, then consider clones of each release you want to preserve. Note that you can use dd to clone an SD card on the host PC, or for an eMMC model, you can clone using flash.sh. Either way, even with the rootfs cloned, you might need to flash even the SD card model (using that rootfs) rather than simply replacing the SD card (this is due to newer releases perhaps having different boot content in the QSPI memory and not in the SD card).

Well, I really appreciate what you are doing, but you are speaking in riddles…

Consider an actual serial console boot log since these can be searched in a text editor and what occurs earlier in boot might matter.

OK… What?? I was using journalctl -b and this is what I know only.

Giving up. Sorry, this is over my head.

Too bad. It was such a thrilling idea to have an overlay FS over the rootfs and it works on other linux systems, w/o having to be a super specialist.

Thanks for all your help.

I’m out here

Serial console is basically a keyboard connection without any need for networking, without any need for a monitor, and without any need for much of anything…it tends to run well even with multiple parts of the system failing…it is just too simple to have many requirements, and thus failure is also rare. The part which is of interest here is not so much reliability as the fact that the GUI and Linux logs are not running during bootloader stages, but serial console does run even in the bootloader prior to Linux ever starting. The timestamps it provides are from a time prior to Linux ever starting, and thus say far more than does a Linux log.

For information on serial console:
https://www.jetsonhacks.com/2019/04/19/jetson-nano-serial-console/

Basically you connect the serial console UART end to the Nano, and the USB end to your desktop PC. You then open a serial console program on the PC and point it at the USB serial UART cable. All display and logging can be done from the PC’s serial console program (which substitutes for a monitor and keyboard). When you start logging just before booting the Jetson you’ll see the entire boot sequence (with timestamps), and log it directly to the host PC (which is convenient for posting to forums).
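A minimal sketch of the host PC side, assuming a Linux host where the USB serial cable shows up as /dev/ttyUSB* or /dev/ttyACM* and where "screen" is installed (both are assumptions, and the function name serial_candidates is made up):

```shell
# List USB serial adapters under a device directory (default /dev).
# The ttyUSB*/ttyACM* naming is an assumption about a typical Linux host.
serial_candidates() {
    dir=${1:-/dev}
    for dev in "$dir"/ttyUSB* "$dir"/ttyACM*; do
        [ -e "$dev" ] && printf '%s\n' "$dev"
    done
    return 0
}

# Plug in the cable, then on the host PC:
#   serial_candidates
#   sudo screen -L /dev/ttyUSB0 115200    # use the device listed above
```

The Nano serial console runs at 115200 baud, and the -L option makes screen write everything it shows to a log file (screenlog.0 in the current directory), which is the file you would attach to a post.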

FYI, serial console log during boot happens to include the journalctl as well. You might need to remove any “quiet” option from boot command line, but that can easily be dealt with after you have serial console running.

Ah this, yes I have done that already, with a headless setup.

But I don’t know. The thing is, I’m focused on the ML application, not on the low level Linux/Nano/Nvidia stuff. I need to hand out my box for a POC, and I know a single unwanted power off can damage the SD. I just wanted to protect that SD against this, since my app doesn’t need to write to it. It turns out it takes 10 trimesters of Linux kernel experience to achieve this, which is done with a simple script on an RPI.

I’m about to give up the entire Nvidia in favour of a RPI 4 plus Coral TPU. From an application standpoint I can achieve the same and I will have way less headaches regarding SD card protection.

That sucks so much… And Nvidia doesn’t care. I guess it would be easy for them to provide a gist regarding “How to install an OverlayFS on Jetson Nano”.

Well, I’m currently flashing an older version. I’m sure I have already tried that too, to no avail. But the Swiss ace who wrote this repo claimed it would work with JP 4.3

Anyway, I usually don’t give up, but this is a PAIN IN THE ASS. All the more so because you are talking to me as if you thought I would understand even a 10th of that…

EDIT: No, wrong. I did that already while trying to capture a kernel crash. 2 months ago, gone…

Will try to set the thing up again, I must have it somewhere over here

I will admit that OverlayFS is far from simple. Many low cost consumer devices were designed to be simple from day one, whereas the NVIDIA devices originated as reference designs which a large company would use as a basis for building its own hardware and software. They were intended for experienced engineers working on commercial projects, and were not originally meant to be plug and play the way the RPi and many other embedded systems are. So it is a long road of understanding lower level embedded system hardware rather than an easier “cookbook” style system like an RPi. To some extent it is evolving toward that, but it isn’t there yet.

The two biggest issues are probably changes in how the bootloader is evolving in recent releases, versus the JP 4.3, combined with the fact that there are both SD card models and eMMC models which require different firmware…all of which are probably involved in any OverlayFS issues you are having.

When I do respond I don’t know what knowledge level someone is at, but I will try to answer questions. I can’t do much about the frustration with the increased learning curve for Jetsons versus consumer oriented systems other than answering as much as I can.

Consider me to be the last Linux dumbass, just capable of opening a terminal and type in some things.

OK, I flashed a R32 and this is the serial log for an unchanged start. Will now apply the scripts for the overlayfs and post the serial log then.

out.txt (22.6 KB)

Good. Applied the “lehni” scripts. System rebooted (boot log attached). Touched a file. Rebooted. Found the file again. Nice ReadOnly system, indeed…:(

T H I S D O E S N O T W O R K

out.txt (80.9 KB)

OK, I’ll leave Nvidia alone with their arrogant attitude to do things for specialist only. I guess the last company who did that successfully was Nokia. Going with RPI instead.

Thanks again.

Good bye

I think a better way to keep the idiots away is to make the box $3000, and not to poach in the price segment of an RPI4 plus add-ons.

End of message :) Thanks anyway. Not your fault.

If someone in the know finally comes up with a recipe, I will consider returning to the arrogant

@dusty_nv Thanks for editing. I would like to sink the entire thread, if I could. It’s a painful demonstration of how powerless you can be when you have no idea and can’t find help or a solution.

For me it feels like “base camp 6” 100 m below the summit. I have to return, while my app is ready and works. I can’t protect the SD and I also will not go for $$$ for any eMMC board just for a POC. It is a real mess.

If anybody is following this: To change to mmcblk1p1 is not the right choice. mmcblk0p1 is correct. Otherwise no boot

Even if you work on something else (like the RPi) it might be worthwhile to continue on this particular project as well, so will try to answer more questions. I’m not sitting there to experiment, so it is more difficult to know what to do next without some possibly (probably?) confusing explanations.

In your file “/boot/extlinux/extlinux.conf”, do you see the word “quiet” as part of the “APPEND” line? If so, then remove the word “quiet”. Any editor will work. To see what is in the file without opening an editor you can “cat /boot/extlinux/extlinux.conf”. When editing your editor must be started as user root, which is done by prefixing “sudo” to the editor start command.
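If you would rather not edit by hand, the same change can be sketched as a filter. This is only an illustration: the function name strip_quiet is made up, and it assumes “quiet” appears as its own word on a line starting with APPEND.

```shell
# Remove the word "quiet" from APPEND lines; every other line passes through unchanged.
# Reads an extlinux.conf on stdin and writes the result to stdout.
strip_quiet() {
    sed -E '/^[[:space:]]*APPEND[[:space:]]/ s/[[:space:]]+quiet([[:space:]]|$)/\1/'
}

# On the Jetson, keep a backup and write the filtered copy back:
#   sudo cp /boot/extlinux/extlinux.conf /boot/extlinux/extlinux.conf.bak
#   strip_quiet < /boot/extlinux/extlinux.conf.bak | sudo tee /boot/extlinux/extlinux.conf > /dev/null
```

Keeping the .bak copy means you can put the original back from another machine if the edited file ever fails to boot.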

Regarding this:

…does this mean that after modifying that file you rebooted again and the change persisted? Changes should revert and go away after a reboot. However, you said “T H I S D O E S N O T W O R K”, so it sounds like the changes were permanent.

The SD card model uses “/dev/mmcblk0” since it doesn’t have eMMC and enumeration starts at 0. If this were an eMMC model, then the SD would be bumped to “/dev/mmcblk1”. It can be hard to always name the correct one when replying to many posts where people might have either model of Nano, so beware the “1” or “0” might be off in posts speaking of a different model. Summary:

  • For SD card models use “/dev/mmcblk0”,
  • for eMMC models the SD card is instead “/dev/mmcblk1”.
  • The first partition appends “p1” to the mmcblk#.

@linuxdev

Even if you work on something else (like the RPi) it might be worthwhile to continue on this particular project as well, so will try to answer more questions. I’m not sitting there to experiment, so it is more difficult to know what to do next without some possibly (probably?) confusing explanations.

I really admire you and appreciate your patience. I will try to resume, because I really think, it is worth it. It is just so close and I can’t find the reason for that.

In your file “/boot/extlinux/extlinux.conf”, do you see the word “quiet” as part of the “APPEND” line? If so, then remove the word “quiet”.

I admit I forgot to do that before. But I know how to do that and I will do next time

…does this mean that after modifying that file you rebooted again and the change persisted? Changes should revert and go away after a reboot. However, you said “T H I S D O E S N O T W O R K”, so it sounds like the changes were permanent.

Well, I did “touch test” and checked, that the file was there. Then I rebooted and I expected to see the file no longer, but it still was there. This in turn did show me, that the file must reside on my SD and not - as expected - in RAM.
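A more direct check than the touch test is to ask what is actually mounted at “/”. The sketch below assumes that a working overlay setup shows the root filesystem with type “overlay” in /proc/mounts; I have not verified what the “lehni” script produces on a Nano, and the function name root_fstype is made up.

```shell
# Print the filesystem type mounted at "/" according to a mounts table
# (default /proc/mounts). If "/" is listed more than once, the last entry
# is the one in effect.
root_fstype() {
    awk '$2 == "/" { t = $3 } END { if (t != "") print t }' "${1:-/proc/mounts}"
}

# On the Jetson:
#   root_fstype
```

If this prints ext4 rather than overlay after the reboot, the overlay was never set up, which would match the test file surviving on the SD card.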

  • For SD card models use “/dev/mmcblk0”,

Yepp. Confirmed.

OK, how to proceed? I would suggest that I do these steps in a row:

  1. Flash a new image from the Nvidia site, basically the latest available. Bring that up, configure it, have the console logger aside

  2. Once this is up, edit /boot/extlinux/extlinux.conf, copy the current state (for documentation) and remove “quiet”.

  3. Reboot and completely log the boot process via serial

  4. Apply the “lehni” repo. This basically does exactly what is required to create an overlayfs: It creates a new “initramfs” and enters this as the boot image. Before rebooting I would check that “quiet” is still removed (I would expect this to be the case).

  5. Then reboot and log the boot process. In case the system comes back I would make the “touch test” check again.

I could provide the boot log for both cases.

Would that be a good plan to follow?

Thanks again. You are pure motivation.

Yes. I also recommend just running updates to get them out of the way before doing more. If an update messes up something it is a lot of work for nothing. If updates are all in and the system is working normally, then you can clone the SD card and avoid all that work by restoring from the clone instead of doing all those steps again. From the host PC, be sure you have a lot of spare disk space (e.g., via “df -H”), and assuming on the host PC the SD shows as “/dev/sdb” (that’s just an example, it could be something else…if you monitor “dmesg --follow” during SD card plugin it will confirm what it actually is):

  1. Make sure the SD card is not mounted, e.g.,
    sudo umount /dev/sdb1
    (repeat for any other partition of the card which “lsblk /dev/sdb” shows as mounted)
  2. Use dd to clone:
    sudo dd if=/dev/sdb of=sdcard_clone.img bs=1M
    (note that “bs=1M” has no effect other than using more buffer to help speed it up)
  3. After the clone is done, then you could restore to the same SD card after it is messed up and skip all those previous install steps and be ready (assumes SD card is not mounted, and that you first cd to the directory with the clone):
    sudo dd if=./sdcard_clone.img of=/dev/sdb bs=1M; sync
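Since the whole point of the clone is to trust it later, it may be worth verifying it right after it is made. A minimal sketch, assuming GNU dd on a Linux host (conv=fsync and status=none are GNU options); the function name clone_and_verify is made up:

```shell
# Copy SRC to DST with dd, flush the data to storage, then compare byte for byte.
# Returns non-zero if the copy fails or the two differ.
clone_and_verify() {
    src=$1
    dst=$2
    dd if="$src" of="$dst" bs=1M conv=fsync status=none || return 1
    cmp "$src" "$dst"
}

# Reading /dev/sdb needs root, so define and call this inside a root shell, e.g.:
#   clone_and_verify /dev/sdb sdcard_clone.img && echo "clone matches the card"
```

The comparison only makes sense in the clone direction, where the card and the image have the same size. After a restore onto a larger card, cmp would report a difference at the end of the image even though the restore was fine.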

As long as the QSPI boot memory is not altered this will have you fully back up to an updated system with account setup and other settings without any more work. Then whatever you test can fail and you won’t have so much to worry about. If you reach a milestone and something gets further along, then you can clone again, and any restore will save that progress.

Rebooting and saving a serial console log is always a good idea if you might need to post something.

I don’t know about the “lehni” repo since I have not used it. I suspect that if it has a weakness, then it is from a need for different content in the initrd (initramfs) due to either the Jetson model’s software, or some change in the L4T release version (there is different boot content depending on L4T release even when the hardware is the same).

Incidentally, one of the biggest reasons for a boot log is to see what the initrd did. In theory, if the initrd echoes what it does as it runs, then we’ll see information about how modules were loaded there, and the result. This is a good plan.
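Before rebooting it is also possible to look inside the initrd that extlinux.conf points at. A sketch under assumptions: initrd_paths is a made-up helper name, and lsinitramfs is the listing tool from initramfs-tools (the same package the install script used to rebuild the initramfs). I have not checked what the “lehni” script actually puts in there, so the grep pattern is a guess.

```shell
# Print every INITRD path named in an extlinux.conf
# (default: the Jetson path used in this thread). Commented-out lines are skipped.
initrd_paths() {
    awk '$1 == "INITRD" { print $2 }' "${1:-/boot/extlinux/extlinux.conf}"
}

# On the Jetson, list overlay-related content of each initrd that is referenced:
#   for img in $(initrd_paths); do
#       echo "== $img"
#       lsinitramfs "$img" | grep -i overlay
#   done
```

If nothing overlay-related shows up in the initrd which is actually named on the INITRD line, the boot would proceed as a normal read-write system.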