System image is still very large after mksparse

Hi, I created my image file using:

./ -r -k APP -G clone.img jetson-tx2-4GB mmcblk0p1
ls -lh clone.*
-rw-r--r-- 1 root root 14G Apr 3 22:06 clone.img.raw
-rw-r--r-- 1 root root 12G Apr 3 22:18 clone.img

If I mount clone.img.raw as a loopback device, I can see that the total used space of the mounted partition is about 8.3GB.
Is clone.img the output of the command ‘mksparse -v --fillpattern=0 clone.img.raw clone.img’?
If so, why is clone.img still so large?

This is not intended to give an exact answer, and may be technically incorrect in some ways, but is intended to give a more intuitive description of “empty space” and what is reduced in size during the creation of a sparse image…

The raw image is the literal total byte size of the partition. Not all of this can be used as storage since there is also metadata, e.g., there is a minimum of one node (it is a tree structure) used even for a file of zero bytes. Total available storage will be something less than the full image size even with an empty image.

The sparse image is basically the same thing, but instead of putting in the literal empty nodes there is a note of how many empty bytes, and then the program using the image inserts that many empty bytes. If all empty space is contiguous, then in theory it would take only a single metadata node to say how many bytes of empty space to put in. If the empty space is fragmented in different locations, then you will probably get more metadata used to describe the empty nodes. The compression depends on how the empty space is arranged.
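A rough way to see how much of a raw image is actually "empty" in the sense described above is to count its zero bytes. This is only a sketch: nonzero_bytes and zero_percent are hypothetical helper names, and the result is an approximation, since it does not model how mksparse groups the data. It does show the key point, though: blocks that the filesystem considers free but that still hold old non-zero data do not count as empty.

```shell
#!/bin/sh
# nonzero_bytes FILE: count the bytes in FILE that are not 0x00.
# Streams the whole file, so expect it to take a while on a multi-GB image.
nonzero_bytes() {
    echo $(( $(LC_ALL=C tr -d '\000' < "$1" | wc -c) ))
}

# zero_percent FILE: percentage of FILE made of zero bytes (integer, rounded down).
zero_percent() {
    total=$(( $(wc -c < "$1") ))
    if [ "$total" -eq 0 ]; then
        echo 0
        return
    fi
    nonzero=$(nonzero_bytes "$1")
    echo $(( (total - nonzero) * 100 / total ))
}

# Example (file name taken from this thread):
#   zero_percent clone.img.raw
```

If the percentage is much lower than the free space reported by df on the mounted image, the free blocks most likely still contain stale data.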

On a related note, if your filesystem has a given block size, and every file on the filesystem is able to fit exactly in that block size, then there will be an exact one-to-one mapping between i-node and/or block count (I’m not trying to be exact) and data (the i-node is the metadata node, the index node).

If a file exceeds the block size, then it needs more than one block (the i-node records which blocks belong to the file; on ext4 each file still uses a single i-node). If for some reason you have a lot of files of zero bytes, then your reserved i-nodes will be used up even though your filesystem may be storing only some tiny fraction of its full size…i-nodes can run out before the data consumes the disk.

If your file is one enormous file which consumes the entire size of the partition, then the file won’t fit. You will have some of that space consumed by the reserved i-nodes. The full space will not be available due to reserved metadata even though it isn’t used. One can set the block size (-b) and the bytes-per-inode ratio (-i) of an ext4 filesystem during mkfs.ext4. The latter changes how many individual files can be added.

If on average you have one node reserved for every 4k of filesystem, and if your file sizes average 4k in size, then you have a perfect match and the filesystem will be the most efficient. There will be no wasted space reserved for i-nodes and every i-node will have a relation to a file.

To see your block size and related information for a specific partition, e.g., on the Jetson the rootfs mmcblk0p1:
tune2fs -l /dev/mmcblk0p1
…and note “Inode count” and “Block count”. This can be tuned during creation of the ext4 filesystem, but has defaults.
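To turn those counters into something easier to compare with the image size, the tune2fs output can be piped through a small filter. This is a minimal sketch: ext4_usage is a hypothetical helper name, and it assumes the usual "Block count", "Free blocks", "Block size", "Inode count" and "Free inodes" lines are present in the output.

```shell
#!/bin/sh
# ext4_usage: read `tune2fs -l` output on stdin and print used/total bytes and inodes.
ext4_usage() {
    awk -F': *' '
        $1 == "Block count" { blocks = $2 }
        $1 == "Free blocks" { free   = $2 }
        $1 == "Block size"  { bsize  = $2 }
        $1 == "Inode count" { inodes = $2 }
        $1 == "Free inodes" { ifree  = $2 }
        END {
            # Used space is (all blocks - free blocks) * block size.
            printf "used_bytes=%.0f total_bytes=%.0f\n", (blocks - free) * bsize, blocks * bsize
            printf "used_inodes=%d total_inodes=%d\n", inodes - ifree, inodes
        }'
}

# Example (run as root on the Jetson):
#   tune2fs -l /dev/mmcblk0p1 | ext4_usage
```

The used_bytes figure is roughly the smallest the sparse image could get if every free block were zeroed.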

If you attach your raw image to a loop device, then you can run tune2fs against it to see the node and block counts…this should be an exact match to running directly on the Jetson the clone came from.

Want to see how well the number of nodes in use (and thus the number of files) matches the space in use? Try this on the running system (optionally use “-H” to see “human units”):
df -T -i /

Compare the percentage used with the “-i” option versus without it, and you’ll see how full the disk is from two separate perspectives, and it is unlikely they will match:

df -H -T -i /
df -H -T /
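The two percentages can also be printed side by side. The following is just a sketch: fill_levels is a hypothetical helper name, and it assumes a df that accepts -P and -i together (GNU coreutils does) and a mount source without spaces in its name.

```shell
#!/bin/sh
# fill_levels [PATH]: show space and inode usage for the filesystem holding PATH.
fill_levels() {
    path=${1:-/}
    # With -P each filesystem is printed on one line; column 5 is the use percentage.
    space=$(df -P "$path" | awk 'NR == 2 { print $5 }')
    inodes=$(df -P -i "$path" | awk 'NR == 2 { print $5 }')
    echo "space=$space inodes=$inodes"
}

# Example:
#   fill_levels /
```

Some filesystems do not report inode usage, in which case the inode value may show as "-".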

When a sparse file is created both i-nodes and actual data change how much empty space can be cut out for the sparse file.

Finally I got the smaller image:

losetup --find --show clone.img.raw # assume loop device is /dev/loop6
zerofree -v /dev/loop6 # That’s it!
losetup -d /dev/loop6
./bootloader/mksparse --fillpattern=0 clone.img.raw clone.img
ls -lh clone.*
-rwxr-xr-x 1 root root 8.2G Apr 4 21:21 clone.img
-rw-r--r-- 1 root root 14G Apr 4 21:19 clone.img.raw
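For anyone repeating this, the same steps can be wrapped so that the loop device name is captured instead of assumed. This is only a sketch under a few assumptions: shrink_clone and run are hypothetical helper names, the commands need root, and the script is started from the directory that contains bootloader/mksparse. Setting DRY_RUN=1 prints the commands instead of running them.

```shell
#!/bin/sh
# run CMD...: execute CMD, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# shrink_clone RAW SPARSE: zero the free blocks of RAW, then rebuild SPARSE from it.
shrink_clone() {
    raw=$1
    sparse=$2
    if [ "${DRY_RUN:-0}" = 1 ]; then
        loopdev=/dev/loopN   # placeholder, nothing is attached in dry-run mode
        echo "+ losetup --find --show $raw"
    else
        # losetup prints the device it picked, e.g. /dev/loop6
        loopdev=$(losetup --find --show "$raw") || return 1
    fi
    # The filesystem must not be mounted read-write while zerofree runs.
    run zerofree -v "$loopdev"
    status=$?
    # Always detach the loop device, even if zerofree failed.
    run losetup -d "$loopdev"
    [ "$status" -eq 0 ] || return "$status"
    run ./bootloader/mksparse --fillpattern=0 "$raw" "$sparse"
}

# Example:
#   shrink_clone clone.img.raw clone.img
```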

I think either the unused space is not zeroed on the eMMC, or I did something wrong when cloning.
Thank you anyway, @linuxdev .