[JetPack 7.1/7.0] l4t_backup_restore.sh hangs indefinitely on nvme0n1p8 for Jetson Thor

Hi NVIDIA Team,

Environment :
Hardware : Official NVIDIA Jetson AGX Thor Developer Kit
Carrier Board: Official NVIDIA DevKit Carrier Board (No custom hardware)
JetPack Version: JetPack 7.1 (L4T r38.4.0) and JetPack 7.0 (L4T r38.2.1)
Host OS: Ubuntu 24.04
Storage: 256GB NVMe SSD (Internal)

Issue Description :

I have successfully flashed JetPack 7.1 onto this DevKit with a 256GB SSD using SDK Manager. The flashing process was smooth, and the system boots/operates perfectly.

However, the official l4t_backup_restore.sh tool consistently hangs when I run a system backup of the Jetson Thor.

The backup proceeds correctly for the first 7 partitions (including the large 235GB APP partition nvme0n1p1), but it always hangs when it reaches nvme0n1p8 (PARTLABEL="esp_alt").

Steps to Reproduce :

Put Jetson Thor into Recovery Mode. Run on Host PC :

1. sudo systemctl stop udisks2.service

2. sudo tools/l4t_flash_prerequisites.sh

3. sudo service nfs-kernel-server start

4. sudo ./tools/backup_restore/l4t_backup_restore.sh -b jetson-agx-thor-devkit (or with -c for massflash)

5. Observe the logs until the backup reaches partition 8.

Diagnostic Evidence :

Local Read Success : I SSH’ed into the Thor initrd and manually ran dd if=/dev/nvme0n1p8 of=/dev/null — it finished in seconds. This proves the DevKit can read its own disk.
NFS Deadlock : The hang occurs during the NFS write-back phase. dmesg reports:

[ 1089.673645] task:dd state:D stack:0 pid:474 tgid:474 ppid:281 flags:0x0000000c
[ 1089.683071] Call trace:
[ 1089.685514] __switch_to+0xe0/0x110
[ 1089.689007] __schedule+0x368/0xc14
[ 1089.692499] schedule+0x34/0xd8
[ 1089.695640] io_schedule+0x3c/0x64
[ 1089.699135] folio_wait_bit_common+0x170/0x340
[ 1089.703326] folio_wait_bit+0x18/0x2c
[ 1089.707166] folio_wait_writeback+0x4c/0xc4
[ 1089.711357] __filemap_fdatawait_range+0x8c/0x114
[ 1089.715898] filemap_write_and_wait_range+0xa4/0xd4
[ 1089.720787] nfs_wb_all+0x28/0x19c
[ 1089.724279] nfs4_file_flush+0xc8/0x110
[ 1089.728122] filp_flush+0x38/0xc8
[ 1089.731261] __arm64_sys_close+0x2c/0x84
[ 1089.735106] invoke_syscall+0x48/0x134
[ 1089.738947] el0_svc_common.constprop.0+0x40/0xf0
[ 1089.743488] do_el0_svc+0x1c/0x30
[ 1089.746980] el0_svc+0x30/0xb8
[ 1089.749774] el0t_64_sync_handler+0x130/0x13c
[ 1089.754314] el0t_64_sync+0x194/0x198
[ 1210.500044] INFO: task dd:474 blocked for more than 604 seconds.
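
For anyone triaging a similar log, a small filter can pull the hung-task reports out of dmesg or a saved serial console log. This is only a sketch: the function name hung_tasks is made up for illustration, and it assumes sed is available wherever the log is processed.

```shell
# Print "name:pid seconds" for every hung-task report found on stdin.
# hung_tasks is a hypothetical helper name, not part of the L4T tools.
hung_tasks() {
    sed -n 's/.*INFO: task \([^ ]*\) blocked for more than \([0-9]*\) seconds.*/\1 \2/p'
}

# Example usage:
#   dmesg | hung_tasks                      # on the target
#   hung_tasks < thor-serial-console.log    # on a saved log
```

Each output line names the blocked task and how long it had been blocked at the time of the report.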

Looking forward to your guidance on how to resolve this.

Hi,

Please provide the serial console log and backup log for further review.

Thanks

Hi,

Thank you for your response. As requested, I have attached the full logs captured from both the Host PC and the Jetson Thor serial console during the failed backup session.

Attached Files :

backup-thor.log : Full execution log from the Host PC side, showing the command used and exactly where it stopped.
thor-serial-console.log : Serial console log captured directly from the Jetson Thor’s debugging port, including the dmesg output during the hang.

Key Highlights from the Logs :

  • You will notice that the backup succeeds for partitions 1 through 7.
  • The hang occurs immediately when the script attempts to back up nvme0n1p8 (esp_alt).
  • In the serial log, you can clearly see the dd task being blocked for more than 120 seconds.

Please let me know if you need any further information or if there are specific tests you would like me to run based on these logs.

thor-serial-console.log (123.7 KB)

backup-thor.log (283.1 KB)

Hi,

Please try adding the -e argument, as in the command below:

sudo ./tools/backup_restore/l4t_backup_restore.sh -e nvme0n1 -b jetson-agx-thor-devkit

Thanks

Hi NVIDIA Team,

Thanks for the suggestion. I tried the command with -e nvme0n1, but the backup still hangs at Partition 8.
The Call Trace shows the dd process is consistently blocked at nfs4_file_flush and nfs_wb_all.
It appears to be a writeback/locking issue during the transport phase.

I would appreciate any guidance or further insights you could provide to help resolve this.

Thanks.

Hi,

We have confirmed that backups can be created with the commands mentioned above on our developer kits.

To clarify a few details:

  • Does your host PC have sufficient disk space to store the backup image?
  • Have you tried connecting the device using a different USB port on the host PC?

Thanks

Hi NVIDIA Team,

Thanks for the reply. I can confirm that my host PC has sufficient disk space (over 500GB free) and I have tested different USB ports with high-quality cables. The issue persists regardless of these factors.

My Host PC is running Ubuntu 24.04 LTS.

The backup hang appears to be caused by an NFSv4 locking/lease timeout when writing large partition images from the Jetson (initrd client) to the Host PC (NFS server).

The default l4t_backup_restore.sh lets the client negotiate the NFS version, and Ubuntu 24.04’s NFS server defaults to v4, which seems unstable for this specific high-throughput scenario.
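
To check which NFS version the client actually negotiated, the mount options can be read back from /proc/mounts on the Jetson while the share is mounted. The helper below is a rough sketch: the name nfs_mount_versions is hypothetical, and it assumes awk is present in the initrd environment.

```shell
# Print "<mountpoint> <nfs version>" for each NFS mount listed in a
# mounts-format file (defaults to /proc/mounts).
# nfs_mount_versions is a hypothetical helper name.
nfs_mount_versions() {
    awk '$3 ~ /^nfs4?$/ {
        ver = "unknown"
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^vers=/) ver = substr(opts[i], 6)
        print $2, ver
    }' "${1:-/proc/mounts}"
}

# Example usage on the Jetson, after the script has mounted the share:
#   nfs_mount_versions
```

On the host side, the versions the NFS server has enabled should be visible in /proc/fs/nfsd/versions while nfs-kernel-server is running.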

I’m not sure if this is the correct fix, but forcing NFSv3 on the client side has at least resolved the current backup timeout issue for me.

Change applied in l4t_backup_restore.sh (around line 180):

Original:
mount -o nolock ${hostip}:${nfs_folder} /mnt

Modified (Working):
mount -o nolock,nfsvers=3 ${hostip}:${nfs_folder} /mnt

With nfsvers=3, the backup completes successfully every time without hanging.
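
If it helps anyone else, the same one-line change can be applied with sed instead of editing by hand. This is a sketch under the assumption that the mount line in your copy of the script matches the "Original" line quoted above exactly; the function name force_nfsv3 is made up, and a .bak copy of the file is kept.

```shell
# Add nfsvers=3 to the "mount -o nolock ${hostip}..." line of the given file.
# force_nfsv3 is a hypothetical helper name; the original file is saved
# next to it with a .bak suffix. Lines that already carry nfsvers=3 no
# longer match the pattern, so running it twice changes nothing further.
force_nfsv3() {
    sed -i.bak 's/mount -o nolock \${hostip}/mount -o nolock,nfsvers=3 ${hostip}/' "$1"
}

# Example usage, from the Linux_for_Tegra directory:
#   force_nfsv3 tools/backup_restore/l4t_backup_restore.sh
#   grep -n 'nfsvers=3' tools/backup_restore/l4t_backup_restore.sh
```

If the grep prints nothing, the mount line in that release is probably worded differently and needs to be edited manually.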

Thanks.

Hi,

Thank you for your feedback.

We will set up Ubuntu 24.04, reproduce the error, and test your solution on our end.

Thanks
