EXT4-fs error (device nvme0n1p1):

I have a Jetson Orin NX in an enclosure from Forecr, with JetPack 5.1 loaded on an M.2 SSD. It was fine for a month or so, but suddenly the system responds to ping and nothing else. If I already have an SSH session open, I can see the prompt but can’t run any commands, not even things like ls or ifconfig. If I reboot the system, it is fine for another 30 minutes or so. The system is passively cooled and gets warm, but no warmer than it has been in the past, when it worked fine.

[ 7212.193103] EXT4-fs error (device nvme0n1p1): __ext4_find_entry:1534: inode #1048735: comm docker: reading directory lblock 0
[ 7215.607411] EXT4-fs warning: 35 callbacks suppressed
[ 7215.607418] EXT4-fs warning (device nvme0n1p1): dx_probe:767: inode #1572866: lblock 0: comm dbus-daemon: error -5 reading directory block
[ 7215.625396] EXT4-fs warning (device nvme0n1p1): dx_probe:767: inode #2248066: lblock 0: comm dbus-daemon: error -5 reading directory block
[ 7215.691533] EXT4-fs error: 36 callbacks suppressed
[ 7215.691538] EXT4-fs error (device nvme0n1p1): __ext4_find_entry:1534: inode #786438: comm python3: reading directory lblock 0
[ 7216.309469] EXT4-fs error (device nvme0n1p1): __ext4_find_entry:1534: inode #1048735: comm docker: reading directory lblock 0
[ 7216.365329] EXT4-fs error (device nvme0n1p1): __ext4_find_entry:1534: inode #1048735: comm gmain: reading directory lblock 0
[ 7216.636095] EXT4-fs warning (device nvme0n1p1): dx_probe:767: inode #3145729: lblock 0: comm sudo: error -5 reading directory block
[ 7216.648313] EXT4-fs warning (device nvme0n1p1): dx_probe:767: inode #3145729: lblock 0: comm sudo: error -5 reading directory block
[ 7216.660541] EXT4-fs error (device nvme0n1p1): __ext4_find_entry:1534: inode #3152181: comm sudo: reading directory lblock 0
[ 7216.672021] EXT4-fs error (device nvme0n1p1): __ext4_find_entry:1534: inode #3152181: comm sudo: reading directory lblock 0
[ 7216.683491] EXT4-fs error (device nvme0n1p1): __ext4_find_entry:1534: inode #3152181: comm sudo: reading directory lblock 0
[ 7216.694972] EXT4-fs error (device nvme0n1p1): __ext4_find_entry:1534: inode #3152181: comm sudo: reading directory lblock 0
[ 7216.706437] EXT4-fs error (device nvme0n1p1): __ext4_find_entry:1534: inode #3152181: comm sudo: reading directory lblock 0
[ 7216.709203] EXT4-fs error (device nvme0n1p1): __ext4_find_entry:1534: inode #786438: comm python3: reading directory lblock 0
[ 7216.717909] EXT4-fs error (device nvme0n1p1): __ext4_find_entry:1534: inode #3152181: comm sudo: reading directory lblock 0
[ 7216.741068] EXT4-fs warning (device nvme0n1p1): dx_probe:767: inode #2248067: lblock 0: comm sudo: error -5 reading directory block
[ 7216.753267] EXT4-fs warning (device nvme0n1p1): dx_probe:767: inode #2248067: lblock 0: comm sudo: error -5 reading directory block
[ 7216.765459] EXT4-fs warning (device nvme0n1p1): dx_probe:767: inode #2248067: lblock 0: comm sudo: error -5 reading directory block
[ 7216.777642] EXT4-fs warning (device nvme0n1p1): dx_probe:767: inode #2248067: lblock 0: comm sudo: error -5 reading directory block
[ 7216.789829] EXT4-fs warning (device nvme0n1p1): dx_probe:767: inode #2248067: lblock 0: comm sudo: error -5 reading directory block
[ 7216.802010] EXT4-fs warning (device nvme0n1p1): dx_probe:767: inode #2248067: lblock 0: comm sudo: error -5 reading directory block
[ 7220.733782] EXT4-fs error: 44 callbacks suppressed
[ 7220.733787] EXT4-fs error (device nvme0n1p1): __ext4_find_entry:1534: inode #786438: comm python3: reading directory lblock 0
[ 7221.241141] EXT4-fs warning: 31 callbacks suppressed
[ 7221.241146] EXT4-fs warning (device nvme0n1p1): dx_probe:767: inode #3145729: lblock 0: comm gmain: error -5 reading directory block
[ 7221.241167] EXT4-fs error (device nvme0n1p1): __ext4_find_entry:1534: inode #786438: comm gmain: reading directory lblock 0
[ 7221.269999] EXT4-fs error (device nvme0n1p1): __ext4_find_entry:1534: inode #786438: comm gmain: reading directory lblock 0
[ 7221.281502] EXT4-fs error (device nvme0n1p1): __ext4_find_entry:1534: inode #786438: comm gmain: reading directory lblock 0
[ 7221.455425] EXT4-fs error (device nvme0n1p1): __ext4_find_entry:1534: inode #1048735: comm docker: reading directory lblock 0
[ 7221.751458] EXT4-fs error (device nvme0n1p1): __ext4_find_entry:1534: inode #786438: comm python3: reading directory lblock 0
[ 7222.365044] EXT4-fs error (device nvme0n1p1): __ext4_find_entry:1534: inode #786438: comm gmain: reading directory lblock 0
[ 7222.376539] EXT4-fs error (device nvme0n1p1): __ext4_find_entry:1534: inode #2375046: comm gmain: reading directory lblock 0
[ 7222.388118] EXT4-fs warning (device nvme0n1p1): dx_probe:767: inode #1572866: lblock 0: comm gmain: error -5 reading directory block
[ 7222.400438] EXT4-fs error (device nvme0n1p1): __ext4_find_entry:1534: inode #1048752: comm gmain: reading directory lblock 0
[ 7222.494287] EXT4-fs error (device nvme0n1p1): __ext4_find_entry:1534: inode #1048735: comm docker: reading directory lblock 0
[ 7222.817633] EXT4-fs warning (device nvme0n1p1): dx_probe:767: inode #3145729: lblock 0: comm sudo: error -5 reading directory block
[ 7222.829869] EXT4-fs warning (device nvme0n1p1): dx_probe:767: inode #3145729: lblock 0: comm sudo: error -5 reading directory block
[ 7222.842184] EXT4-fs warning (device nvme0n1p1): dx_probe:767: inode #2248067: lblock 0: comm sudo: error -5 reading directory block
[ 7222.854369] EXT4-fs warning (device nvme0n1p1): dx_probe:767: inode #2248067: lblock 0: comm sudo: error -5 reading directory block
[ 7222.866562] EXT4-fs warning (device nvme0n1p1): dx_probe:767: inode #2248067: lblock 0: comm sudo: error -5 reading directory block
[ 7222.876322] EXT4-fs warning (device nvme0n1p1): dx_probe:767: inode #1572866: lblock 0: comm dbus-daemon: error -5 reading directory block
[ 7222.878746] EXT4-fs warning (device nvme0n1p1): dx_probe:767: inode #2248067: lblock 0: comm sudo: error -5 reading directory block
[ 7222.891616] EXT4-fs warning (device nvme0n1p1): dx_probe:767: inode #2248066: lblock 0: comm dbus-daemon: error -5 reading directory block

Here’s a video showing the dmesg output at the time of failure. It’s everything from “timeout, aborting” onward.
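A log like the one above is easier to judge once it is condensed. The following is a minimal Python sketch, assuming the kernel messages have been saved as text (for example from dmesg). The function name `summarize_ext4_errors` and the regular expression are illustrative, not part of any tool, and the sample lines are copied from the log above. The `error -5` in the warnings is the negated errno `EIO` (input/output error), which suggests the read request itself failed rather than returning bad directory contents.

```python
import errno
import re
from collections import Counter

# Matches per-inode EXT4 messages such as:
#   EXT4-fs warning (device nvme0n1p1): dx_probe:767: inode #1572866: lblock 0: comm dbus-daemon: error -5 reading directory block
LINE_RE = re.compile(
    r"EXT4-fs (?P<level>error|warning) \(device (?P<dev>[^)]+)\): "
    r"(?P<func>\w+):\d+: inode #(?P<inode>\d+):"
    r".*?comm (?P<comm>[^:]+):"
    r"(?: error (?P<err>-?\d+))?"
)


def summarize_ext4_errors(log_text):
    """Count EXT4 messages per process name, per inode, and per errno name."""
    by_comm, by_inode, by_errno = Counter(), Counter(), Counter()
    for line in log_text.splitlines():
        m = LINE_RE.search(line)
        if m is None:
            continue  # skips lines such as "35 callbacks suppressed"
        by_comm[m["comm"]] += 1
        by_inode[int(m["inode"])] += 1
        if m["err"] is not None:
            code = abs(int(m["err"]))  # kernel prints the negated errno
            by_errno[errno.errorcode.get(code, str(code))] += 1
    return by_comm, by_inode, by_errno


SAMPLE = """\
[ 7212.193103] EXT4-fs error (device nvme0n1p1): __ext4_find_entry:1534: inode #1048735: comm docker: reading directory lblock 0
[ 7215.607411] EXT4-fs warning: 35 callbacks suppressed
[ 7215.607418] EXT4-fs warning (device nvme0n1p1): dx_probe:767: inode #1572866: lblock 0: comm dbus-daemon: error -5 reading directory block
[ 7216.636095] EXT4-fs warning (device nvme0n1p1): dx_probe:767: inode #3145729: lblock 0: comm sudo: error -5 reading directory block
"""

if __name__ == "__main__":
    comms, inodes, errnos = summarize_ext4_errors(SAMPLE)
    print("by process:", comms.most_common())
    print("by inode:  ", inodes.most_common())
    print("by errno:  ", errnos.most_common())
```

Many unrelated processes and inodes all failing with the same I/O error within a few seconds points more toward the device becoming unreachable than toward damage to one particular directory.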

That might be an actual SSD failure. Regardless of whether or not this is a hardware failure, I think it would be far easier to diagnose if you could mount this on another Linux computer as a secondary drive. Do you have a way to connect it to a Linux host PC, or even an external USB M.2 SSD dock?

Hello @alex247

Can you please share the SSD model on your carrier board?
Also, have you tested the same carrier board with another NVMe SSD?

Hi @mehmetdeniz, the 500GB SSD was the one you provided with the DSBOX-ORNX. If I were to guess, I think it may have been a WD Blue SSD (perhaps SN570?). I haven’t tried a different SSD yet. We usually opt for a Samsung 980 Pro, but I have a 980 (non-Pro) available that I could flash.

That’s right. It should be the SN570.
Have you powered it off while the system was working under load (some Docker/Python-based applications, etc.)?

It typically runs fine for about 30 minutes, and then the issue begins. It was fine for a few weeks before this issue started.

It will have been powered off while under load at some point, yes.

We advise using overlayroot to prevent file system corruption in unexpected power-off scenarios.

To do that, you need to reflash your Jetson. Is that suitable for your use case?

If something had gone wrong with the root fs due to a sudden shutdown, would I still be able to run it successfully for 30 minutes?
Would I still be able to save things with this enabled? Yes, I can reflash in this instance.

See:
https://forums.developer.nvidia.com/t/tuning-linux-on-jetson-nano-for-better-data-reliability-in-power-failure-scenario/252664/5

The shorter answer is that this is a full PC. Imagine if your regular PC is under heavy load, and to turn it off, you yank the power cord. Same thing.

Thanks for that. So overlay root wouldn’t be a solution for us if we can’t write anything.

I’m still curious as to how the system works with no issues and then suddenly completely falls over. Is that expected if the FS is corrupted?

Seeing as it runs fine for a while, is there any form of “repair” or check that I should try before resorting to a reflash?

OverlayFS does write, but it only writes “edits” to RAM. It works if you have enough RAM and don’t need the content after a reboot (the original filesystem stays read-only).
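The behaviour described above can be pictured with a toy model. This is only an analogy in Python, not how the kernel implements OverlayFS: `collections.ChainMap` stands in for the merged view, a read-only mapping for the lower layer (the root filesystem on the SSD), and a plain dict for the RAM-backed upper layer. The file names and contents are made up for illustration.

```python
from collections import ChainMap
from types import MappingProxyType

# Lower layer: the original root filesystem, exposed read-only.
lower = MappingProxyType({"/etc/hostname": "jetson", "/opt/app/config": "v1"})


def boot():
    """Return a fresh merged view with an empty RAM-backed upper layer."""
    upper = {}  # lives in RAM only; discarded when the machine powers off
    return ChainMap(upper, lower)


root = boot()
root["/opt/app/config"] = "v2"     # the edit lands in the upper layer only
root["/tmp/result.txt"] = "data"   # a new file also lands in the upper layer

print(root["/opt/app/config"])     # merged view shows the edited content
print(lower["/opt/app/config"])    # lower layer still holds the original

root = boot()                      # simulate a reboot: upper layer starts empty
print(root["/opt/app/config"])     # back to the original content
print("/tmp/result.txt" in root)   # the new file no longer exists
```

In this model, anything that must survive a reboot would have to be written somewhere outside the overlay.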

This is normal for all computers other than special synchronous devices.

What you are running into is a limitation of the “journal” size in a journaling filesystem. For a more detailed explanation, see:
https://forums.developer.nvidia.com/t/tuning-linux-on-jetson-nano-for-better-data-reliability-in-power-failure-scenario/252664/5

One thing to pay attention to while reading is that lost data is not the same as a corrupted filesystem. Lost data is basically just truncated content, whereas corruption alters the structure of the filesystem such that any write would further destroy it. Repairs, once the filesystem is in that bad a state, are manual because it is up to the end user which parts of the filesystem will be cut out (and they can be large parts). My advice is to completely reinstall. There were so many errors that it is hard to distinguish this from a failed drive.
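The distinction between lost data and a corrupted filesystem also shows up at the application level. As a hedged sketch, and not something proposed in this thread, an application that saves state on Linux can reduce the chance of leaving a truncated file after a power cut by writing to a temporary file, flushing it with `os.fsync`, and renaming it over the target. The helper name `atomic_write` and the file name `state.json` are hypothetical. This does not protect against a failing drive and does not repair an already damaged filesystem.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace `path` with `data` so a reader sees the old or the new file, not a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())    # ask the kernel to push the contents to the device
        os.replace(tmp_path, path)    # atomic rename on POSIX filesystems
        # Sync the directory so the rename itself is recorded on disk.
        dir_fd = os.open(directory, os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
    except BaseException:
        # Clean up the temporary file if it was never renamed into place.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "state.json")
        atomic_write(target, b'{"frames": 1}')
        atomic_write(target, b'{"frames": 2}')
        with open(target, "rb") as f:
            print(f.read())
```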
