Recommended way to duplicate emmc rootfs (APP partition) into internal ssd (nvme) without dd

Whether or not it works depends on how you use system.img.raw. You cannot just copy system.img.raw as a file. You have to actually press the bits from inside of system.img.raw directly onto the partition (not as a file, but as binary data), and although scp can get this file to the Jetson, rsync works on individual files, not binary data.

These are somewhat ad hoc comments (not really needed unless you are interested in what went wrong), in no particular order:

  • If you have a binary image (such as system.img.raw), then you must use dd to apply this to a partition. dd works with binary data and the filesystem is unknown and irrelevant to dd.
  • rsync requires a filesystem and looks at individual files and directory structure, so rsync has no point in working with system.img.raw (which is binary data on the outside, and includes a filesystem in its bits on the inside where rsync does not look).
  • Sizes of files might differ depending on which tool shows them. Much like imperial/metric unit conversion, in some places you might see “kB” or “MB”, which are base 10 multiples of 1,000 or 1,000^2; other tools use “KiB” or “MiB” (kibibytes or mebibytes), which are base 2 multiples of 1,024 or 1,024^2. Running out of space can come from sizing the destination partition with a base 10 figure while measuring the files with a base 2 figure. Mixing units can result in unexpectedly filled space (see the size-checking sketch after this list).
  • Partition sizes are not the amount of destination space available unless you are speaking of binary data. Filesystems within a partition have overhead, and if you were to format a partition as ext4, then the available space (filesystem empty space) is less than the partition size (binary data size). Assuming the partition and the filesystem are the same size can also result in unexpectedly filled space because of that overhead.
  • Binary data has no apparent structure (something using the data would have to “know” what is in the data). On the outside “system.img.raw” is binary data.
  • Filesystems are a data structure (a tree of nodes and metadata to understand the tree). On the inside “system.img.raw” has ext4 tree structure.
  • system.img.raw, as a file, is binary data. dd is the tool for this. The output location of the dd command becomes structured ext4 data.
  • system.img.raw, once its contents are extracted and written over a partition, is structured ext4 data. The operating system driver is what understands a filesystem like ext4, and tools like cp, scp, or rsync in turn use that driver to manipulate the data.
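
As a quick illustration of the base 10 versus base 2 issue (a minimal sketch; the device name and mount point are just the examples used later in this thread), you can ask the tools for exact byte counts instead of abbreviated units:

# Exact sizes in bytes remove any kB/KiB ambiguity:
lsblk -b /dev/nvme0n1                    # partition sizes in bytes
sudo parted /dev/nvme0n1 unit B print    # partition table in bytes
df -H /mnt/app_a                         # base 10 units (kB, MB, GB)
df -h /mnt/app_a                         # base 2 units (K, M, G meaning KiB, MiB, GiB)
# For reference: 16 GB (base 10) is 16,000,000,000 bytes, while
# 16 GiB (base 2) is 16 * 1024^3 = 17,179,869,184 bytes.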

Since you have “system.img.raw” as a file (it was copied via scp) in “APP_b”, I’ll give an example of one way to use this, but there are others. I will assume “APP_b” (nvme0n1p2) is mounted (letting the driver treat it as ext4) at “/mnt/app_b” (adjust for your case). I’ll assume that “APP” (nvme0n1p1) is mounted at “/mnt/app_a” (adjust for your case).

You could create APP via:

# You don't want to write to APP as a partition (rather than a filesystem) with `dd` while
# it is mounted, so unmount the destination first (APP_b stays mounted since it holds the source file):
sudo umount /mnt/app_a

# While mounted, APP_b is structured ext4 data. When accessing nvme0n1p2 directly, it is binary data.

# Note that "block size", the "bs", is irrelevant other than being a bigger buffer for faster
# operation, and could be omitted:
sudo dd if=/mnt/app_b/system.img.raw of=/dev/nvme0n1p1 bs=10M

The above would overwrite all content on nvme0n1p1, but because the underlying data is a self-contained ext4 filesystem, this means you’ve simultaneously formatted nvme0n1p1 as ext4 and also installed all files. You could then:
sudo mount /dev/nvme0n1p1 /mnt/app_a
(adjust for wherever you want to mount this)
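
One optional follow-up (an assumption on my part, not part of the original steps): if nvme0n1p1 is larger than system.img.raw, then the ext4 filesystem written by dd only spans the size of the image. You could grow it to fill the partition before the mount above (or after unmounting again), for example:

# e2fsck is required before an offline resize; with no size argument resize2fs grows
# the ext4 filesystem to fill the whole partition:
sudo e2fsck -f /dev/nvme0n1p1
sudo resize2fs /dev/nvme0n1p1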

At the moment APP_b (mounted on “/mnt/app_b”) still just holds system.img.raw as a file and is not an extracted copy of the rootfs. You could now do this to mirror APP to APP_b:

# We are using rsync, so both source and destination must be mounted and treated as files.
# Keep in mind there are a lot of rsync options; this is just a kind of "blanket" case, but we have to start empty:
sudo rm /mnt/app_b/system.img.raw
sudo rsync -avcrltxAP --info=progress2,stats2 --delete-before --numeric-ids --exclude '.gvfs' --exclude 'lost+found' /mnt/app_a/ /mnt/app_b/

One can actually use dd to read on one Linux system but write to another over the network (you could have skipped the scp to APP_b, and skipped the use of rsync). Half of dd can run on the host PC (reading), and the other half can run on the destination (the Jetson), so the file copy of system.img.raw can be skipped entirely (dd is a binary image copy). If not for permission issues with remote login of the root user on Ubuntu systems, this would be simple. I hate to mention this for your specific case, but if you want more comments on this, then see this article:
https://nfolamp.wordpress.com/2010/06/14/performing-backups-with-netcat/

The gist of the above URL is that you could do a remote dd directly from the host to the Jetson’s APP_b and APP. This is only true if you can get past the confusion of sudo combined with ssh or netcat.
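
As a rough sketch of the idea in that article (the port, addresses, and device are examples only; exact nc flags differ between netcat implementations, and this assumes the sudo/ssh permission issue above has already been dealt with):

# On the Jetson (receiving side), with nvme0n1p1 unmounted; listen on an arbitrary port:
nc -l 5000 | sudo dd of=/dev/nvme0n1p1 bs=10M
# On the host PC (sending side), stream the raw image to the Jetson's address:
dd if=system.img.raw bs=10M | nc <jetson-ip-address> 5000

# Or with ssh instead of netcat (requires root login or passwordless sudo on the Jetson):
dd if=system.img.raw bs=10M | ssh <user>@<jetson-ip-address> 'sudo dd of=/dev/nvme0n1p1 bs=10M'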


Note that if dd of APP_b’s system.img.raw to APP fails, it means the /dev/nvme0n1p1 partition is too small for the image, likely because of mixing base 10 and base 2 units when sizing it. However, dd might still succeed where rsync ran out of space: rsync deals with files and the overhead of ext4, while dd has no overhead and bypasses ext4 (the data it writes happens to contain an ext4 filesystem, but as written it is just binary; overhead does not matter until you mount the ext4).
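
A quick way to check this before running dd (paths and devices are the examples from above):

stat -c %s /mnt/app_b/system.img.raw       # image size in bytes
sudo blockdev --getsize64 /dev/nvme0n1p1   # partition size in bytes; must be >= the image size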

Almost forgot: cpio understands serial transfer of data, but is not so different from rsync. It is not the fault of rsync that your copy failed, nor would cpio do better. You can think of cpio as something like a simplified rsync with fewer bells and whistles.
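
For reference only (a sketch, not a recommendation over the rsync command above), a cpio copy-pass version of the same mirror might look like this:

# -p is copy-pass mode, -d creates directories, -u overwrites, -m preserves modification times;
# find -xdev stays on one filesystem, and -print0/-0 handles unusual file names:
sudo sh -c 'cd /mnt/app_a && find . -xdev -print0 | cpio -0pdum /mnt/app_b'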

Thank you for your detailed answer. I saved it for future reference.
What about a tool or method to determine whether the rootfs was duplicated correctly, without any corruption?
I think it’s important to verify before you remove your original rootfs and use the new one.

If the rootfs has never been mounted (or was only ever mounted read-only so no metadata will change), then you can use a checksum of the rootfs as a whole (which also means a running system can never have its image verified as an exact match as a whole). However, any kind of mount or operation which might change metadata means you need to verify every individual file. This is also one of the reasons why “rsync” exists: it uses checksums (if the option is given) to determine whether individual files have changed before it updates the destination.
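
A hedged sketch of both approaches (device names, mount points, and the image file are the examples used earlier, and this assumes you still have a copy of system.img.raw):

# Whole-image check (only meaningful if nvme0n1p1 has never been mounted read-write since the dd):
SIZE=$(stat -c %s /mnt/app_b/system.img.raw)
sha256sum /mnt/app_b/system.img.raw                 # hash of the original image
sudo head -c "$SIZE" /dev/nvme0n1p1 | sha256sum     # hash of the same number of bytes from the partition

# Per-file check once anything has been mounted: -c forces checksum comparison and -n makes it
# a dry run so nothing is modified; any file listed in the output differs between the two copies:
sudo rsync -avcxn --delete /mnt/app_a/ /mnt/app_b/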

Incidentally, a true clone will always be a perfect match. Cloning is done in recovery mode for eMMC models, and the rootfs is not mounted (the rootfs is thus available for read, but not write during a clone).

Consider that a clone is a 100% copy of everything every time a clone is performed. “rsync” can do this too, but usually only on its first run. After rsync has run once, on the next run it will iterate through the entire backup source, compare each item to the destination, and update only the items which have changed. rsync has various methods of determining change, and options pick which one is used. One method is by timestamp of the file, which is very fast but not confidence inspiring; the other method is via checksums.
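
For example (the mount points are the ones assumed above; both commands update the destination, they only differ in how “changed” is decided):

# Default quick check: compares file size and modification time only (fast):
sudo rsync -ax --delete /mnt/app_a/ /mnt/app_b/
# Checksum comparison of every file (-c): much slower, but catches content differences
# even when size and timestamp happen to match:
sudo rsync -axc --delete /mnt/app_a/ /mnt/app_b/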

If you were to clone to a PC, and loopback mount the clone on the PC, you could then run “rsync” from a running Jetson to the loopback-mounted clone on the PC (a hybrid where the first run is via clone, and updates are via “rsync”). This is a common practice all across Linux (other than the first run being a clone mounted via loopback).
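
A rough sketch of that hybrid on the host PC (the file name, mount point, and address are examples; pulling the whole rootfs as root over ssh runs into the same root-login complication mentioned above):

# On the host PC: loopback mount the raw clone:
sudo mkdir -p /mnt/clone
sudo mount -o loop system.img.raw /mnt/clone
# Later updates: pull changes from the running Jetson into the loopback-mounted clone
# (run as root on the PC so ownership is preserved; needs root read access on the Jetson):
sudo rsync -avcx --numeric-ids --delete root@<jetson-ip-address>:/ /mnt/clone/
sudo umount /mnt/clone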

Note: There are other partitions beyond the rootfs. This won’t do anything with those partitions.
