Recovering AGX Xavier from boot failure without reflash (NVIDIA Persistence Daemon failure)

I am using an AGX Xavier Dev Kit.

While changing CUDA versions, I seem to have updated the graphics driver to an incompatible version (I assume that's the cause). I would like to recover my work from the board before re-flashing. It hangs on boot with the message “NVIDIA Persistence Daemon Failed”, and the screen turns on and off. These systems do not have GRUB. Is there a way to enter recovery mode to move the files? Is there a way to fix the driver issue? Or is there a way to reflash without wiping the drive contents? I need access to /home/nvidia/.

Regards,
Marcus

If the Xavier is in recovery mode and connected to a Linux host PC, then the Xavier shows up as a custom USB device. Appropriately enough, the “driver package” on the host PC is what understands that device; if you have ever flashed or run JetPack/SDK Manager, then you already have this. Recovery mode is normally used to flash, but it can also do the reverse: recovery mode can clone.

FYI, when you clone you will get two large files. One is a “raw” file, and is the full size of the partition. If for example the partition is 28GB, then the host PC will get a file that big. The “sparse” file is much smaller, basically skipping the parts of the filesystem which are empty. If the filesystem is nearly full, then the sparse file will be nearly as large as the raw file. The sparse file can only be used to flash, and cannot be examined or manipulated. I advise throwing away the sparse file and saving only the raw file. If you want to archive the raw file in less space, then you could for example run “bzip2 -9” on it to reduce its size (but this will take a lot of time and CPU power). Just copying a file that large to another location takes a lot of time. The gist is that you should plan on about 50GB of free space prior to cloning, since the host has to hold the raw file plus a sparse file that may be nearly as large.
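As a rough pre-flight check for the free space mentioned above, something like the following could be run in the directory where the clone will be written. This is only a sketch: `has_free_space` is a hypothetical helper name, and the 50 in the example is just the estimate from this post, not an official requirement.

```shell
#!/bin/sh
# has_free_space DIR REQUIRED_GB   (hypothetical helper, not part of L4T)
# Exits 0 when the filesystem holding DIR has at least REQUIRED_GB
# gigabytes (counted as GiB) available to the current user.
has_free_space() {
    dir=$1
    required_gb=$2
    # "df -P" forces one line per filesystem; "-k" reports 1024-byte blocks.
    # Field 4 of the second line is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    required_kb=$((required_gb * 1024 * 1024))
    [ "$avail_kb" -ge "$required_kb" ]
}

# Example: check the current directory against the ~50GB estimate.
if has_free_space . 50; then
    echo "Enough free space for the clone."
else
    echo "Less than 50GB free here; clear space before cloning."
fi
```

Adjust the number to your own partition size; a larger rootfs needs correspondingly more room.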

Cloning will not care much about which release you use to clone. Restoring should be done with the same L4T/JetPack/SDK Manager release which originally created the content the clone is from.

Have you flashed before? Then you probably have:
~/nvidia/nvidia_sdk/JetPack_...some version.../Linux_for_Tegra/

The above is where you will find “flash.sh”. The required command may vary by release, but for most releases this will clone the rootfs if the USB-C is connected and the Jetson is in recovery mode (you can verify recovery mode is visible to the host via “lsusb -d 0955:7019”). The basic command is:
sudo ./flash.sh -r -k APP -G my_backup.img jetson-xavier mmcblk0p1
…then delete “my_backup.img”, but keep “my_backup.img.raw”.
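If you want the recovery mode check to gate the clone, here is a minimal sketch. `recovery_device_listed` is a hypothetical helper that just looks for the 0955:7019 ID mentioned above in `lsusb` output; it does not run flash.sh itself.

```shell
#!/bin/sh
# recovery_device_listed   (hypothetical helper)
# Reads "lsusb" output on stdin and exits 0 if a device with the
# recovery mode ID 0955:7019 is listed.
recovery_device_listed() {
    grep -q 'ID 0955:7019'
}

# Only report success when lsusb exists and the device is listed.
if command -v lsusb >/dev/null 2>&1 && lsusb | recovery_device_listed; then
    echo "Xavier is visible in recovery mode; OK to start the clone."
else
    echo "No 0955:7019 device found; check the USB-C cable and recovery mode."
fi
```

If the device is not listed, put the Xavier back into recovery mode and check the cable before running the clone command.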

To examine the clone from the host PC via loopback mount:

sudo mount -o loop ./my_backup.img.raw /mnt
cd /mnt
# Do things...
ls
cd -
sudo umount /mnt
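Since the goal here is /home/nvidia/, a sketch of pulling that directory out of the mounted clone might look like this. `copy_home` and the destination `~/xavier_home_backup` are hypothetical names for illustration. It assumes the raw image is already loopback-mounted on /mnt as above (adding “ro” to the mount options, i.e. “-o loop,ro”, keeps the image from being modified), and you may need sudo if the files belong to a user ID that differs from yours on the host.

```shell
#!/bin/sh
# copy_home MOUNTPOINT DEST   (hypothetical helper)
# Copies MOUNTPOINT/home/nvidia into DEST, preserving permissions,
# timestamps and symlinks ("cp -a").
copy_home() {
    src="$1/home/nvidia"
    dest=$2
    if [ ! -d "$src" ]; then
        echo "No home directory found at $src" >&2
        return 1
    fi
    mkdir -p "$dest" && cp -a "$src" "$dest/"
}

# Assumes the clone is already mounted on /mnt; does nothing otherwise.
if [ -d /mnt/home/nvidia ]; then
    copy_home /mnt "$HOME/xavier_home_backup"
fi
```

After this the files would be under ~/xavier_home_backup/nvidia/, and the clone can be unmounted.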

You can add or remove files, or just use this to flash with. Be very careful when flashing, as it is tempting with such a large file to have it in only one place, and if you forget the “-r” option during flash, then you will overwrite your image. To command line flash with your clone:

  1. Copy my_backup.img.raw to:
    Linux_for_Tegra/bootloader/system.img
  2. From “Linux_for_Tegra/”, with the Xavier freshly in recovery mode (you cannot use recovery mode for two operations in a row without reset) and visible with the USB-C:
    sudo ./flash.sh -r jetson-xavier mmcblk0p1

The “-r” option reuses system.img without creating a new one. It won’t matter whether that file is raw or sparse; either will work. However, expect a raw file to take longer to flash.
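Because it is easy to lose the only copy of the clone, one option is to stage it with a copy (not a move) and verify the copy before flashing. A sketch, where `stage_image` is a hypothetical helper and the paths in the example are the ones used in this post; you may need sudo if bootloader/ is owned by root from an earlier flash:

```shell
#!/bin/sh
# stage_image SRC L4T_DIR   (hypothetical helper)
# Copies SRC to L4T_DIR/bootloader/system.img, leaving SRC in place,
# then confirms the two files are byte-for-byte identical.
stage_image() {
    src=$1
    target="$2/bootloader/system.img"
    [ -f "$src" ] || { echo "Missing $src" >&2; return 1; }
    [ -d "$2/bootloader" ] || { echo "No bootloader/ under $2" >&2; return 1; }
    cp "$src" "$target" && cmp -s "$src" "$target"
}

# Example, run from the directory containing both the clone and Linux_for_Tegra/.
if [ -f ./my_backup.img.raw ] && [ -d ./Linux_for_Tegra/bootloader ]; then
    stage_image ./my_backup.img.raw ./Linux_for_Tegra && echo "system.img staged and verified."
fi
```

This costs another copy's worth of disk space and time, but the original my_backup.img.raw stays untouched even if a flash without “-r” overwrites system.img.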