PXE boot rootfs over network

Hello all. We are attempting to PXE boot our system and have so far been able to boot our custom kernel into a customized initramfs image. The init script has been changed to bring up the network stack, at which point it downloads a minimal rootfs image (in the form of a squashfs) over the network using ‘wget’.

I have determined that:

  1. the rootfs is downloading correctly (the checksum is correct)
  2. I can mount the squashfs on a normally booted jetson image using mount -t squashfs /mnt/rootfs and the contents have the correct permissions
  3. The kernel supports squashfs and all of the other filesystem types (determined using cat /proc/filesystems)
  4. The compressed rootfs is ~400 MB. The extracted size is ~800 MB. The Orin NX has 8 GB of RAM, 4 of which should be accessible to initramfs, unless I am mistaken.

However, I am seeing that the squashfs fails to mount from the init script. There is no error message, even if I try to pipe stdout to /dev/kmsg. I just get an errno of “1”. Here is the basic code in ‘init’:

busybox wget -O ${rootfs_file} ${rootfs_url}
mkdir -p /mnt/rootfs
mount -t squashfs -o loop,ro ${rootfs_file} /mnt/rootfs

The mount fails with error code 1.

I have also tried to unsquashfs the contents, which does work (the contents are extracted into /squashfs-root), but in this case I cannot switch the mount point to this folder.

What is making this more difficult is that even though the init script tries to exec bash to drop to a command line, the command line does not work in this environment. Additionally, the commands don’t seem to print usable error messages, even when redirected to /dev/kmsg.

Hi rstubbs,

Are you using the devkit or custom board for Orin NX?
What’s the Jetpack version in use?

We’ve not verified using squashfs on Jetson device.
Could you share the detailed reproduce steps for us to verify it locally?

This is on a custom carrier, using Jetpack 5.12

I have modified init to download the squashfs image over the network (described above), but I think it would also work to just put the squashfs image directly into the initramfs image. Extract the initramfs image locally, insert a squashfs image into some folder like /tmp, then recompress the contents into a new initramfs image.

Then, edit the init script to try to mount the squashfs version of the rootfs as shown above. The usual mount point for other boot types is to /mnt, so a command like

mount -t squashfs -o loop /tmp/rootfs.squashfs /mnt

should get the same result.

I am supplying the kernel with options through the PXE boot process, which are:

root=/dev/ram0

I’m a long way from being able to answer, but I’ll suggest that every piece of that has to be available to the kernel which is doing the mount. You might consider modifying the init script to “exercise” this in pieces.

An example is to exercise loopback. You have the uncompressed squashfs file I assume (if not, then uncompressing manually would be an “exercise”), and you could use just the “losetup” command to cover the file with loopback, and to not try to mount in one step. If you can cover with loopback (losetup), then you can try to manually mount and purposely also umount the loopback device. If that succeeds, then mount that loopback device again. Then you would try to mount the loopback device instead of mounting the file with a command that would combine loopback (you’re breaking apart the steps to their minimum tool because you’re interested in unoptimized tiny steps to find the broken step).

Normally, many of those commands combine more than one step even if it is just one command. The more you break it down into tiny pieces the sooner you will narrow in on the exact and specific cause. If it turns out covering with loopback is an issue, then you have a good start; if it turns out the mount of the loopback is what fails, then you’ve narrowed this down to something more specific and can drill down on that.
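The step-by-step approach described above could be sketched roughly as follows. This is only an illustration, not the script from this thread: run_step, attach_and_mount, and LOG_TARGET are hypothetical names, and it assumes a losetup that accepts -f and -r (both the util-linux and busybox versions do). Note that it captures stderr as well as stdout, since redirecting only stdout to /dev/kmsg drops the error messages most tools print.

```shell
#!/bin/sh
# Hypothetical helper names; LOG_TARGET defaults to the kernel log.
LOG_TARGET="${LOG_TARGET:-/dev/kmsg}"

# Run one command, log its stdout AND stderr plus the exit status.
run_step() {
    label="$1"; shift
    output="$("$@" 2>&1)"
    status=$?
    echo "init: ${label}: exit=${status} ${output}" >> "${LOG_TARGET}"
    return "${status}"
}

# Attach the image to a loop device and mount it as separate steps,
# returning a distinct code for whichever step fails first.
attach_and_mount() {
    image="$1"; mount_point="$2"
    run_step "image present" ls -l "${image}" || return 1
    run_step "find free loop device" losetup -f || return 2
    loop_dev="${output}"   # run_step leaves the command output here
    run_step "attach loop device" losetup -r "${loop_dev}" "${image}" || return 3
    mkdir -p "${mount_point}"
    run_step "mount squashfs" mount -t squashfs -o ro "${loop_dev}" "${mount_point}" || return 4
}
```

From init this might be called as attach_and_mount "${rootfs_file}" /mnt/rootfs, and the return code then says which individual step broke.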

Thank you for the reply, linuxdev.

I had added the libraries needed for unsquashfs and losetup in earlier experiments, to get them to run.

I indeed tried to associate a loopback with the file in an earlier exercise. losetup fails, but again, these commands don’t output any error messages so it is difficult to determine what is failing specifically.

I tried the following:
mknod /dev/loop0 b 7 0
mknod /dev/loop1 b 7 1
chmod 660 /dev/loop*
chown root:disk /dev/loop* # This fails
ls -liah /dev/loop0 # prints brw-rw---- 1 0 0 7, 0 <date> /dev/loop0
ls -liah /dev/loop1 # prints brw-rw---- 1 0 0 7, 1 <date> /dev/loop1
losetup -a > /dev/kmsg # dump all loopbacks… no output
losetup -f -r ${rootfs_file} # Fails with errno “1”

Is the loopback support by means of a module (it sounds like it is)? Then before running this, can your init print the output of lsmod? There isn’t even any need to consider squashfs before loopback covers “/tmp/rootfs.squashfs”. To use that module it should be present in both the boot content (the initrd) and in the filesystem of the fully booted system (whatever the pivot root mounts as the final filesystem after it is done with any initrd).

Also, never try to chmod a loopback device. Partly this is because it is a driver and not a file, and because everything in init is already run as root.

Regarding losetup, from the man page about return values:

RETURN VALUE
       losetup returns 0 on success, nonzero on failure.  When losetup displays the status of a loop device, it returns 1 if the device is not configured and 2 if an error occurred which prevented determining the status of the device.

In your script perhaps also add an “ls -l ${rootfs_file}” just before the losetup attempt.
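One way to check from init whether the running kernel actually has loop and squashfs support, whether built in or loaded as modules, is to look at /proc/devices and /proc/filesystems. The sketch below is an assumption-laden illustration: has_block_driver and has_filesystem are hypothetical helper names, and the optional second argument exists only so the functions can be pointed at a test file.

```shell
#!/bin/sh
# Return 0 if the named driver is listed under "Block devices:".
# The section matters: a character driver can share the same major number.
has_block_driver() {
    driver="$1"; devices_file="${2:-/proc/devices}"
    in_block=0
    while read -r major name; do
        case "${major} ${name}" in
            "Block devices:") in_block=1 ;;
            "Character devices:") in_block=0 ;;
            *) [ "${in_block}" -eq 1 ] && [ "${name}" = "${driver}" ] && return 0 ;;
        esac
    done < "${devices_file}"
    return 1
}

# Return 0 if the named filesystem type is registered with the kernel.
# Lines look like "nodev<TAB>sysfs" or "<TAB>squashfs".
has_filesystem() {
    fs="$1"; fs_file="${2:-/proc/filesystems}"
    while read -r first second; do
        [ "${second:-$first}" = "${fs}" ] && return 0
    done < "${fs_file}"
    return 1
}
```

In init, something like has_block_driver loop || echo "init: no loop driver" > /dev/kmsg would flag a missing loop driver before losetup is ever attempted.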

That’s what was tripping me up. At some point make menuconfig was run and built-in loop block device support was removed, while squashfs support was retained. I also had to go through and make sure all dependencies were covered, which was a long process.

For those of you who want to boot a ‘live’ version of jetson by serving the kernel, initramfs and rootfs image over the network and running the whole kit directly from RAM, do the following:

  1. Extract the Jetson initramfs image: sudo unmkinitramfs <rootfs>/boot/initrd.img /tmp/initrd/
  2. Edit the init script in /tmp/initrd/init to do the following:
    # After squashfs is in the initramfs tree:
    modprobe -v loop # load the loop module if it isn’t built in
    kmod list > /dev/kmsg # list all loaded modules
    mount -t squashfs -o loop ${squashfs_file} ${mount_point}
    exec chroot ${mount_point} /sbin/init 2
  3. Add dependencies into your initramfs as needed (in my case, loop.ko and several shared objects needed for zipping and extraction). Dump the dependencies of each binary you add with objdump -p <bin name> | grep NEEDED. Then make sure they are present with find . -name <lib name>
  4. Re-pack the initramfs: cd /tmp/initrd/; sudo find . | cpio -H newc -o | gzip -9 > <initramfs image file path>
  5. Serve the kernel and initramfs image from your PXE boot server of choice
  6. The kernel will load the initramfs and run the custom init script, which loads the required modules, optionally fetches the squashfs file from the server, mounts it, and then uses chroot to switch to the new root and exec its /sbin/init as PID 1.
  7. Build and modify the rootfs of your choice from the tools/sample_fs directory, or better yet, build one using meta-tegra if you need to actually know what’s in your rootfs…
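The dependency check in step 3 can be tedious by hand, so here is a minimal sketch of how it might be automated. The helper name report_missing_libs is hypothetical; it reads objdump -p output on stdin and looks for each NEEDED library anywhere under the given tree. It only covers direct dependencies, so each library that gets added would need the same check in turn.

```shell
#!/bin/sh
# Read "objdump -p" output on stdin and print every NEEDED library
# that cannot be found by name under the given initramfs tree.
# Returns 1 if anything is missing, 0 otherwise.
report_missing_libs() {
    tree="$1"
    missing=0
    while read -r tag lib; do
        # Only the "NEEDED <library>" lines of the dynamic section matter.
        [ "${tag}" = "NEEDED" ] || continue
        if [ -z "$(find "${tree}" -name "${lib}" 2>/dev/null)" ]; then
            echo "missing: ${lib}"
            missing=1
        fi
    done
    return "${missing}"
}
```

A typical call would be objdump -p <bin name> | report_missing_libs /tmp/initrd, run on the host before re-packing the image.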

Many thanks.
