Jetson Xavier AGX not booting after fstab modification

I ran out of space on the extraordinarily small 32GB storage on the AGX and attempted to move /usr to an SD card. I made modifications in fstab, rebooted, and now it won’t start. I am aware I performed this operation incorrectly, as the boot initializes on the eMMC/SD card and rootfs must be installed on the SD.

The problem is now – there does not appear to be an option (as on all other Linux systems in existence) to access Grub or SSH into the device to reverse the fstab changes. Before I go on a tirade and ask why there are no backdoors to this device in such circumstances, I’d like to hear some advice on how to fix this problem, as I have over 2 weeks of work on this machine and I’d hate to lose it all due to not having access to fstab.

Any advice you could provide would be greatly appreciated. This unit has been incredibly frustrating to work with, as it has not been built with CUDA in mind (i.e. none of the most popular CUDA-enabled libraries work with it out of the box without significant hacking and modification). For example, even the OpenCV that is installed via SDKManager is not CUDA-enabled, but that’s a discussion for another thread.

Is this the NX, or is this the full Xavier? This is the NX forum, and much is in common between the NX and the full Xavier (the AGX), but confirming the specific model is important (perhaps it is an eMMC version of the NX?).

Assuming this is a model of either NX or AGX with eMMC, was your edit to fstab within the eMMC filesystem? Or was the edit to the SD card fstab?

FYI, GRUB is only for PC architecture and requires a standard BIOS. In embedded systems the equivalent of the BIOS is typically a set of custom partitions, which then hand over to the U-Boot bootloader (or in some cases CBoot emulating U-Boot function).

The correct “back door” would be serial console (but this is just a text terminal, and if the system fails to boot, then you might be out of luck). See this if you are using the NX (which has the same serial console as the Nano):

However, the ability to rescue from an embedded system’s U-Boot (or CBoot) is far more limited than with GRUB. Mostly what you would get is an opportunity to boot a different kernel or device tree, if and only if you set those up ahead of time. The ability to edit the fstab is unlikely.

Depending on platform it is possible you could create an SD card rescue image, mount the eMMC image, and then edit the fstab to repair it.

The most reliable method of dealing with this is also the longest in terms of time, and has some learning curve to it. This would be to clone the rootfs to the PC (meaning you need a lot of spare disk space on the PC), edit the loopback mounted clone’s fstab, and then flash the edited clone back onto the Jetson. Despite the requirements of time and disk space this is highly recommended. The reason is that this provides a backup of your work. At later dates you can even perform an rsync update of the loopback mounted clone to keep it up to date and avoid a second clone should you want to update a lot.
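As a rough sketch of the fstab repair step on a loopback-mounted clone: the helper below comments out any fstab entry for a given mount point instead of deleting it. The function name comment_out_fstab_entry, the /mnt mount location, and the assumption that the bad entry is the one mounting /usr are mine for illustration; check the actual fstab before relying on it.

```shell
# Hypothetical helper (not part of any NVIDIA tool): comment out every fstab
# entry whose mount point (second field) matches, leaving other lines as-is.
comment_out_fstab_entry() {
    fstab_path="$1"    # e.g. /mnt/etc/fstab on the loopback-mounted clone
    mount_point="$2"   # e.g. /usr
    tmp_file="$(mktemp)" || return 1
    awk -v mp="$mount_point" '
        # Lines that are already comments are passed through unchanged.
        $0 !~ /^[ \t]*#/ && $2 == mp { print "# " $0; next }
        { print }
    ' "$fstab_path" > "$tmp_file" || { rm -f "$tmp_file"; return 1; }
    # Write the contents back in place so ownership and permissions are kept.
    cat "$tmp_file" > "$fstab_path"
    status=$?
    rm -f "$tmp_file"
    return "$status"
}

# Example (from a root shell, with the raw clone loopback mounted on /mnt):
#   comment_out_fstab_entry /mnt/etc/fstab /usr
```

Commenting the line out rather than removing it keeps a record of what was changed, which helps if the clone is later compared against the original.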

I had a feeling this would be the answer. It has been my unfortunate experience with Jetson, in general, up to this point. I am using the full Xavier (latest AGX). I attempted to connect over serial with gtkterm, and I see output from the device, but it does not accept any input and basically hangs at “Press Enter for maintenance or CTRL-D to continue”. Neither option does anything, even with a keyboard connected directly to the device and HDMI output.

So it sounds to me that I have lost a couple of weeks of work. This is unfortunate. Would you be able to provide some brief guidance on how to create an SD card rescue image? I tried this with a USB, and it recognizes and mounts it, but not much more than that. I never get a login prompt or any other reasonable sign of life.

FYI, for those reading, this is in the NX forum, but the platform is actually Xavier (full, not NX).

Do you know which L4T release version is on this Xavier?

Can you provide any more details about the USB storage showing up? For me this is sort of trial and error, but it sounds like further testing of this might be of some benefit.

Regarding SD card, I’m not sure if I can help with this. A bit of explanation can hopefully clarify why it is difficult.

Most Jetsons use U-Boot for the boot loader. All of the Jetsons have earlier boot stages, up to CBoot. Once CBoot is done most Jetsons then load U-Boot, and this is what loads the Linux kernel. However, Xavier and NX skip U-Boot. Some partial functionality is provided in CBoot to emulate U-Boot. My current Xavier is a bit out of date, and thus I’m not sure if more U-Boot compatibility has been added, but in all cases the U-Boot compatibility within CBoot would be only a partial emulation of U-Boot options.

Normally, on a model with eMMC (and all Xaviers are like this, although not all NX are), the flash provides what is more or less a pointer to where to find the “/boot/extlinux/extlinux.conf” file. However, there are a series of U-Boot macros which will normally recognize if there is an extlinux.conf on alternate media, and can then switch to that due to the macro finding this (e.g., U-Boot might normally use the eMMC “/boot” content, but have the option, when detecting other media, to use that media’s “/boot” content instead). When you specify to flash to mmcblk0p1 this becomes the device with the default pointer. This is true regardless of using CBoot or U-Boot. However, the macros used for searching for other media may or may not exist on CBoot…don’t know since my release does not allow CBoot to print environment variables. If you put what is a more or less bootable SD card in, and if CBoot searches for alternate media, then it might be possible to create the SD card rescue.

Your partial success with a USB device for storage might indicate that CBoot is in fact searching for external media to boot…don’t know, but perhaps it is so. The trouble is that the “/boot/extlinux/extlinux.conf” on the USB device would need to be edited to tell it to use the rootfs on that instead of the one on eMMC. This would be similarly true if you were using an SD card to boot.

If you examine the host PC which performed the flash, then you will find this directory:

You will also see an “almost” copy of the full filesystem flashed at “Linux_for_Tegra/rootfs/”. Prior to flash, the script is used to add some NVIDIA-specific content into the rootfs/ (a one-time need). During a specific flash, arguments passed to the flash script will also copy some content into the “rootfs/boot/” directory, and this contains any extlinux.conf such that the default boot is to the device you named in the flash (the pointer I mentioned earlier).

Once that content is in place, if you were to create the first partition of an SD card as ext4 type and correctly copy that rootfs/ content in to the partition, followed by editing the extlinux.conf to change the rootfs to the SD card instead of the eMMC, then you would have your rescue. Or if it is a USB storage device, then the extlinux.conf of that device would need to be edited to point the rootfs to the USB first partition instead of the eMMC.
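The extlinux.conf edit described above could be sketched like this. It assumes the file has an APPEND line containing a root=... token, which is how I understand the L4T extlinux.conf to look. The function name set_extlinux_root is hypothetical, and the device names in the usage comments (/dev/sda1 for a USB first partition, /dev/mmcblk1p1 for an SD card first partition) are assumptions that depend on how the Jetson enumerates the device.

```shell
# Hypothetical helper: point the root= token of every APPEND line in an
# extlinux.conf at a different device. Other lines are left untouched.
set_extlinux_root() {
    conf_path="$1"   # e.g. /mnt/boot/extlinux/extlinux.conf
    new_root="$2"    # e.g. /dev/sda1 (must not contain "|" or "&")
    tmp_file="$(mktemp)" || return 1
    # Only the first root=... token on each APPEND line is replaced.
    sed -E "/^[[:space:]]*APPEND[[:space:]]/ s|root=[^[:space:]]+|root=${new_root}|" \
        "$conf_path" > "$tmp_file" || { rm -f "$tmp_file"; return 1; }
    # Write back in place so ownership and permissions are preserved.
    cat "$tmp_file" > "$conf_path"
    status=$?
    rm -f "$tmp_file"
    return "$status"
}

# Example (root shell, rescue media's first partition mounted on /mnt):
#   set_extlinux_root /mnt/boot/extlinux/extlinux.conf /dev/sda1       # USB (assumed name)
#   set_extlinux_root /mnt/boot/extlinux/extlinux.conf /dev/mmcblk1p1  # SD (assumed name)
```

Whether CBoot actually picks up the edited file from external media is a separate question; this only covers the edit itself.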

If you still have the generated image from your original flash (which would be “Linux_for_Tegra/bootloader/system.img”), then you could use this instead to create your rescue device partition. You’d actually use the “bootloader/system.img.raw” so you could edit the “extlinux.conf” prior to creating the partition, such that this points the rootfs at the rescue device instead of at the eMMC.

An example of loopback mounting a system.img.raw:

cd /where/ever/it/is/Linux_for_Tegra/bootloader/
sudo mount -o loop ./system.img.raw /mnt
cd /mnt/boot/extlinux
# edit the extlinux.conf to name your rescue device instead of eMMC...
# leave the mount point before unmounting, otherwise umount reports "target is busy"
cd -
sudo umount /mnt

One could then use dd with the system.img.raw as the source and the rescue partition as the destination and have a rescue disk (still not guaranteed to work). A correct recursive copy of the loopback mounted system.img.raw could also be used similarly to dd to create that content on your rescue device.
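A minimal sketch of the dd step, assuming the rescue partition is at least as large as the raw image. The function name write_image is hypothetical, and /dev/sdX1 in the usage comment is a placeholder, not a real device name. Everything on the target partition is overwritten, so double check the device (e.g., with lsblk) before running anything like this.

```shell
# Hypothetical helper: copy a raw image byte-for-byte onto a target
# (a partition device node, or a plain file when experimenting).
write_image() {
    image_path="$1"    # e.g. ./system.img.raw
    target_path="$2"   # e.g. /dev/sdX1 (placeholder)
    [ -r "$image_path" ] || { echo "cannot read image: $image_path" >&2; return 1; }
    # 1 MiB blocks; written as a plain number so it is portable across dd variants.
    dd if="$image_path" of="$target_path" bs=1048576 || return 1
    # Flush cached writes before the media is removed.
    sync
}

# Example (as root, with the rescue partition NOT mounted):
#   write_image ./system.img.raw /dev/sdX1
```

The image must be the raw one; the sparse system.img is not a directly mountable filesystem image.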

Like I said though, I’m not sure if the CBoot of Xavier will actually search for alternate boot media since I cannot see how macros work in CBoot (I can’t printenv). Maybe it will work if the search macro content is there. If your USB device is searched, then this hints that it will work. Maybe your USB device would have worked if the extlinux.conf had been edited to point at the new rootfs. Thus anything you can provide about the boot from the USB device would be of interest (a serial console boot log would be very useful when using the USB device).

Just to throw a twist in this, the system.img.raw is no different than if you had cloned the Xavier (a clone produces a sparse image equivalent to system.img, plus a raw image equivalent to system.img.raw). The difference is that a rescue system created from the system.img.raw or the rootfs/ content is entirely default, whereas a rescue image created from a clone is an exact copy of the running system. You could edit the clone the same way you can edit system.img.raw to point at alternate boot media, but with a clone you could instead simply fix the problem directly, e.g., by repairing the fstab and removing some unimportant content to free space. You could then use the clone for rescue, or flash with that clone directly and simply have it “fixed”.

Having a clone would imply that even if you wiped out the Xavier you’d still have the full content. Clones do take a lot of space on the host PC (figure at least 32GB, possibly temporarily up to 64GB since both a raw and sparse image are created, where the sparse image approaches the size of the raw image as the filesystem fills up). A clone would free you to make mistakes and experiment without worry. This can also add over an hour for the clone to complete, but overall, I really think your clone would be the best approach since rescue media might fail, and then you’d need the clone anyway.
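Since running out of space partway through a clone wastes a lot of time, a quick check beforehand might look like the sketch below. The function name has_free_space is hypothetical, and the 64 figure in the usage comment comes from the rough estimate above, not from any measured requirement. It treats a GB as 1024*1024 KiB, which errs slightly on the safe side.

```shell
# Hypothetical helper: succeed only if the filesystem holding dir_path has
# at least required_gb of space available to the current user.
has_free_space() {
    dir_path="$1"      # e.g. . (the Linux_for_Tegra directory)
    required_gb="$2"   # whole number, e.g. 64
    # df -P prints one unwrapped line per filesystem; -k reports 1 KiB blocks.
    available_kb="$(df -Pk "$dir_path" | awk 'NR == 2 { print $4 }')"
    [ -n "$available_kb" ] || return 1
    required_kb=$(( required_gb * 1024 * 1024 ))
    [ "$available_kb" -ge "$required_kb" ]
}

# Example, run from the directory where the clone will be written:
#   has_free_space . 64 || echo "not enough room for both raw and sparse images" >&2
```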

A typical clone would go like this:

cd ~/nvidia/nvidia_sdk/JetPack...version.../Linux_for_Tegra/
# Make sure the Xavier is connected with the USB-C and is in recovery mode.
# Make sure you have a lot of spare disk space, e.g., check "`df -H .`".
sudo ./flash.sh -r -k APP -G my_backup.img jetson-xavier mmcblk0p1

…then you will find you have files “my_backup.img” (sparse) and “my_backup.img.raw”. I throw out the sparse image and keep only the raw image. Once you finish working with the raw image, you could compress it to save space (this takes a lot of time). With the clone in hand, it would then be trivial to complete the rescue without losing anything, even if an SD card or USB device fails as alternate boot media.