Problem shutting down the system with device mapper (I think)

Hello everyone,

I know this is probably not closely related to Linux for Tegra, but since I’m getting a kernel stack trace, I’m posting here in the hope that someone can point me in the right direction to fix my issue.

Here is the scenario:

I have a Linux installation with a LUKS-encrypted partition mounted read-write on /mnt, and a rootfs that is also LUKS-encrypted and mounted read-only.

Whenever I try to shut down the system, it never powers down. Everything freezes (I can’t interact with the serial terminal via keyboard anymore), and after about 120 seconds (the kernel’s hung-task timeout, going by the message below) I receive this kernel trace:

[ 1088.638263] INFO: task systemd-shutdow:1 blocked for more than 120 seconds.
[ 1088.645526]       Not tainted 4.9.337-unox #1
[ 1088.650010] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1088.658136] Call trace:
[ 1088.660630] [<000000003ae63707>] __switch_to+0x9c/0xc0
[ 1088.665837] [<000000007c73338d>] __schedule+0x23c/0x7e0
[ 1088.671268] [<00000000e285f148>] schedule+0x40/0xa8
[ 1088.676232] [<00000000402afbdd>] bdi_unregister+0x1e8/0x230
[ 1088.681839] [<0000000041b6e526>] blk_cleanup_queue+0xf0/0x118
[ 1088.687599] [<0000000003df533d>] cleanup_mapped_device+0x9c/0xc8
[ 1088.693603] [<00000000f9a7e1ec>] __dm_destroy+0x148/0x218
[ 1088.698997] [<000000000d45da0b>] dm_destroy+0x24/0x30
[ 1088.704043] [<0000000021ea91a5>] dev_remove+0x100/0x160
[ 1088.709261] [<00000000aae7b94b>] ctl_ioctl+0x25c/0x548
[ 1088.714394] [<000000005d5d6d17>] dm_ctl_ioctl+0x28/0x38
[ 1088.719613] [<00000000eff7fa91>] do_vfs_ioctl+0xb0/0x8d8
[ 1088.725105] [<0000000068610fe2>] SyS_ioctl+0x8c/0xa8
[ 1088.730096] [<00000000ad48cfec>] el0_svc_naked+0x34/0x38

Well, I’m not a kernel expert, but according to the trace I think it has something to do with the device mapper: it can’t destroy the mapped device for some reason?

I’ve tried the following in the hope it would help:

  • killing every process using /mnt before shutting down, in case some process was holding the device busy; this didn’t help
  • remounting /mnt read-only before shutting down; this also didn’t help

If I add a watchdog
ShutdownWatchdogSec=10s
in /etc/systemd/system.conf, I can REBOOT the system from the command line, since it reaches the timeout and the system is forcefully rebooted, but this doesn’t seem to work for the shutdown process.

If anyone has any suggestion, or knows whether this might be a bug in the specific kernel version I’m using, or anything else (boot arguments, whatever), please let me know how I can investigate further.

The kernel version is 4.9.337, built from the Linux for Tegra sources repository.

Thanks a lot :)

Maybe attach the UART to check the kernel log from before the system gets stuck.

Thanks

In addition to the UART log, I will point out some related details (maybe not all that useful, although understanding them might help solve the mystery)…

During shutdown the system has to stop every process which depends on the disk; tasks which read or write files must be terminated. You might see this issue if either (A) the device mapper itself hangs (the device mapper is the kernel layer which maps a virtual block device, such as an unlocked LUKS volume, onto the underlying partition), or (B) something on the filesystem is hung and does not release it, in which case the DM indirectly would not be able to halt.

It becomes fairly important (if you are going to debug this and not just flash) to identify what process is actually hung. Is it the DM? Is it some program accessing the filesystem? This is probably unrelated to the Jetson itself, but it could also be a quirk of how LUKS works on a Jetson.

The following experiment might offer some clues if you choose to debug rather than flashing; I will explain the Magic SysRq system.

Kernel developers have often needed certain “critical situation” controls while working on something which might go wrong, e.g., to gather more information, or to avoid data loss during some bugs. Like the serial console, these calls to the kernel are very difficult to make fail. These “magic key bindings” also work by echoing the character to the “/proc/sysrq-trigger” file. To work with this you need either a local keyboard, or a logged-in console (such as the serial console). Using echo requires root (sudo or a root login), whereas the keyboard does not require any special permission.

Incidentally, if I use the notation “ALT-SYSRQ-...something...”, it means hold down the ALT key, then hold down the SysRq key (which is the unshifted print screen, PRTSCN key, usually above the “INS” and “DEL” keys), then tap the “…something…” key. Let go of all keys. It is sort of like CTRL-ALT-DEL, but it is ALT-SYSRQ-key. Or, user root can “echo ...something... > /proc/sysrq-trigger”. The two are interchangeable.

A common scenario is that the end user wants to sync the unwritten data to disk before doing something risky. The magic sysrq key for that is the ‘s’ key. This is a good way to check out how it works. If you monitor “dmesg --follow” (preferably on serial console), and then on a locally attached keyboard use the keystrokes ALT-SYSRQ-s, you should see a message about emergency sync if it is enabled (by default it is enabled). Try it, and see if you see an emergency sync message.
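If the keyboard combination does nothing, it may be worth checking the SysRq mask first. As far as I know, /proc/sys/kernel/sysrq holds a bitmask which only limits the keyboard combinations (root writing to /proc/sysrq-trigger is not restricted by it), where 1 enables everything, 16 is the sync bit, 32 the read-only remount bit, and 128 the reboot bit. A minimal sketch of such a check, where sysrq_allows is just a helper name made up for illustration:

```shell
# Sketch only: sysrq_allows is a hypothetical helper, not an existing tool.
# $1 = value read from /proc/sys/kernel/sysrq
# $2 = bit to test (16 = sync, 32 = remount read-only, 128 = reboot/poweroff)
sysrq_allows() {
    mask=$1
    bit=$2
    if [ "$mask" -eq 1 ] || [ $(( mask & bit )) -ne 0 ]; then
        echo yes    # 1 enables every function; otherwise the bit must be set
    else
        echo no
    fi
}

# Example on a live system (prints yes if ALT-SYSRQ-s should be accepted):
#   sysrq_allows "$(cat /proc/sys/kernel/sysrq)" 16
```

If this prints no for the sync bit, the echo method as root should still work.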

If that works, then this might be the normal way to shut down safely if you expect a filesystem failure:

# Do this twice:
ALT-SYSRQ-s
ALT-SYSRQ-s

# Do this to remount filesystems read-only:
ALT-SYSRQ-u

# Do this to force reboot:
ALT-SYSRQ-b

The same thing with echos:

# Become root:
sudo -s

# sync twice:
echo s > /proc/sysrq-trigger
echo s > /proc/sysrq-trigger

# remount read-only:
echo u > /proc/sysrq-trigger

# force reboot:
echo b > /proc/sysrq-trigger

This should let you avoid filesystem corruption. Furthermore, you can do this with only a serial console login as root (sudo), which means the GUI can be logged out of. If no regular user has ever logged in, other than on the serial console to run sudo, and the problem does not show up, then the process which is hanging is likely in user space and was started as a result of a user login. If the problem still occurs even with no other login (other than root to use echo), then it is probably not caused by anything a login started. A local keyboard is even better than echo, but then how would you monitor log messages? I guess you could log in on the serial console to see messages, and use the local keyboard for the key bindings. Try whatever you can get to, and note when you see the failure to umount, and whether any logins might have started up a process.

Hello
Thanks for your reply.

This way I can of course remount the /mnt filesystem read-only and reboot the system.

It still remains a mystery why the system refuses to shut down cleanly, and why a traditional reboot command has to rely on the watchdog timing out to force the reboot.

It’s something related to cryptsetup for sure… meh :(

I’m using SSH to send commands and the UART to monitor dmesg:

[  282.658063] Emergency Sync complete
[  283.355568] sysrq: Emergency Sync
[  283.355568] sysrq: Emergency Sync
[  283.359006] Emergency Sync complete
[  284.713276] sysrq: Emergency Remount R/O
[  284.713276] sysrq: Emergency Remount R/O
[  284.726171] EXT4-fs (dm-1): re-mounted. Opts: (null)
[  284.728839] Emergency Remount complete

and then

# shutdown -h now
# Connection to 192.168.1.111 closed by remote host.

On the debug console I get:

Broadcast message from root@digitalid on pts/0 (Thu 2023-12-07 18:16:17 CET):

The system will power off now!

[FAILED] Failed unmounting mnt.mount - /mnt.
[  484.479280] INFO: task systemd-shutdow:1 blocked for more than 120 seconds.
[  484.486552]       Not tainted 4.9.337-unox #1
[  484.491041] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  484.498957] Call trace:
[  484.501417] [<000000000e6071ec>] __switch_to+0x9c/0xc0
[  484.506761] [<000000002a1979d2>] __schedule+0x23c/0x7e0
[  484.512279] [<000000005c189028>] schedule+0x40/0xa8
[  484.517204] [<00000000ba04fcb4>] bdi_unregister+0x1e8/0x230
[  484.522983] [<0000000078c7191e>] blk_cleanup_queue+0xf0/0x118
[  484.528818] [<00000000bb58d830>] cleanup_mapped_device+0x9c/0xc8
[  484.534859] [<00000000722d1a88>] __dm_destroy+0x148/0x218
[  484.540272] [<0000000037dd1770>] dm_destroy+0x24/0x30
[  484.545326] [<000000005fda8ead>] dev_remove+0x100/0x160
[  484.550546] [<0000000064df0f75>] ctl_ioctl+0x25c/0x548
[  484.555867] [<00000000e0c7aad2>] dm_ctl_ioctl+0x28/0x38
[  484.561130] [<00000000dee0020b>] do_vfs_ioctl+0xb0/0x8d8
[  484.566443] [<00000000124e41d5>] SyS_ioctl+0x8c/0xa8
[  484.571405] [<000000008686f01b>] el0_svc_naked+0x34/0x38

Further testing, using fuser to kill the processes holding /mnt busy:

~# fuser -km /mnt/*
~# umount /mnt/
~# shutdown -h now

This way /mnt is unmounted and the error about unmounting /mnt no longer appears, but the system still doesn’t shut down.

I still get the hung-task message, now for dmsetup, but no kernel stack trace anymore:

[  242.815476] INFO: task dmsetup:3422 blocked for more than 120 seconds.
[  242.822571]       Not tainted 4.9.337-unox #1
[  242.827569] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

I believe you are right about it being related to the encrypted filesystem. For now, if you complete the sync twice, and then remount read-only, you’ll be safe to cut power or use the button to force power off or power reset.

Here is a subtle part of the test: Is the behavior exactly the same when you go read-only, and then reboot with ALT-SYSRQ-b (or “echo b | sudo tee /proc/sysrq-trigger”; a plain “sudo echo b > /proc/sysrq-trigger” would fail because the redirect is performed by the unprivileged shell)? I suspect it is. However, if it differs, it might be related to user space programs. If it is the same, then I think the issue is entirely within the encrypted filesystem drivers and something talking to the partition which is not in user space.

I don’t know if this will find the process which is hung, but after you have synced and gone read-only, with the serial console logging, can you run this command and post the output:
ps ax --forest -o pid,state,tname,ppid,command

I’m hoping to look for odd states, especially zombie processes.
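To narrow that listing down, something like the following might help. It is only a sketch: filter_stuck is a made-up helper name, and it assumes the state is in the second column, as it is with the ps command above. It keeps the header line plus any task in state D (uninterruptible sleep, which is what the hung-task messages are about) or Z (zombie).

```shell
# Sketch only: filter_stuck is a hypothetical helper, not an existing tool.
# Reads "ps" output on stdin, with the process state in the second column,
# and keeps the header line plus tasks whose state starts with D or Z.
filter_stuck() {
    awk 'NR == 1 || $2 ~ /^[DZ]/'
}

# Example on a live system:
#   ps ax --forest -o pid,state,tname,ppid,command | filter_stuck
```

ALT-SYSRQ-w (or “echo w > /proc/sysrq-trigger” as root) should also dump blocked tasks to the kernel log, which might give a stack trace even when the hung-task message arrives without one.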

Sorry I’m only getting back to you after so long.

I had personal issues that kept me away from work for several days.

I did some more research and I think there is definitely something very wrong with cryptsetup.

In fact I’ve managed to manually shut down every process and cleanly unmount the encrypted filesystem.

But then when I manually run
cryptsetup luksClose, it hangs completely and the kernel starts spitting out those errors:

root@digitalid:/etc/systemd/system# kill -TERM 2797 2804 2887 2888 2932
root@digitalid:/etc/systemd/system# fuser -m /mnt/*
root@digitalid:/etc/systemd/system# fuser -m /mnt/*
root@digitalid:/etc/systemd/system# fuser -m /mnt/*
root@digitalid:/etc/systemd/system# umount /mnt/
root@digitalid:/etc/systemd/system# cryptsetup luksClose /dev/mapper/mmcblk0p18_crypt

and in the serial console after a while

[  967.818185] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1088.638524] INFO: task cryptsetup:4589 blocked for more than 120 seconds.
[ 1088.645622]       Not tainted 4.9.337-unox #1

So… yeah, I’m fairly confident the issue lies in cryptsetup or in how cryptsetup talks to the device mapper kernel module / whatever.
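One more check that might be worth doing right before the luksClose is whether the kernel still counts an opener on the mapping. A rough sketch, assuming the usual “Open count:” line in the output of dmsetup info; dm_open_count is a made-up helper name:

```shell
# Sketch only: dm_open_count is a hypothetical helper, not an existing tool.
# Reads "dmsetup info <name>" output on stdin and prints the open count.
dm_open_count() {
    awk -F: '/^Open count/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Example on a live system (as root), after the umount:
#   dmsetup info mmcblk0p18_crypt | dm_open_count
```

A value other than 0 would suggest something still holds the device open; 0 followed by a hang would point more toward the kernel side of the device mapper.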

Do you know if there is any developer, mailing list, or something similar where I can report this problem?

Since this is the last kernel provided by NVIDIA, I’m not expecting any upgrade that might eventually fix the problem.

Something like DCELL or JTAG debugging is quite difficult and expensive. Next to that is KGDB and KGDBOC, but that too is pretty extreme on the learning curve. What you might consider is increasing log level. This could be of interest:
https://linuxconfig.org/introduction-to-the-linux-kernel-log-levels
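In practice that could look roughly like this. As far as I know, the first field of /proc/sys/kernel/printk is the current console log level, and “dmesg -n” (as root) changes it; console_loglevel below is a made-up helper name.

```shell
# Sketch only: console_loglevel is a hypothetical helper, not an existing tool.
# Reads the content of /proc/sys/kernel/printk on stdin and prints the first
# field, which is the current console log level.
console_loglevel() {
    awk '{ print $1 }'
}

# Example on a live system:
#   console_loglevel < /proc/sys/kernel/printk
#   sudo dmesg -n 8      # let every message level reach the console
```

With the level raised before the shutdown, the serial console might show more of what the device mapper is doing right before the hang.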

If you could reproduce this on an RPi using roughly the same kernel version, then the regular kernel developers would probably be able to work on it (the RPi is arm64 as well). Otherwise someone would need to have the specific Nano and kernel to work on this. A long shot would be checking the Ubuntu bug lists to see if there is any similar cryptsetup or device mapper bug (in which case it isn’t architecture-dependent and perhaps it is already fixed in an upstream kernel).

In fact my plan was to try out newer kernel versions, but I don’t know if there is any chance of running a newer kernel on this Jetson Nano device.

Do you know if the kernel from newer JetPacks works on the Nano? Not the whole toolchain, JetPack, or rootfs; the kernel alone would suffice for testing. Do you also know how to fetch a newer kernel from the NVIDIA repos?

Thanks very much,

Newer kernels won’t work. If you were really good with kernel porting, then you could probably get something to work, but you might still be missing some functionality. It isn’t until Orin, with the developer preview flash software, that alternate kernel versions, e.g., from mainline, are supported.

You can find what your current L4T release is with “head -n 1 /etc/nv_tegra_release”. You can then find the web page for a specific L4T release here:
https://developer.nvidia.com/linux-tegra

Within that is a source download. The kernel source is actually a compressed tar archive within the source download. So you’d extract the kernel source file from the downloaded source, and then extract the kernel from that extracted file.
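The two-step extraction could be sketched like this. extract_nested is a made-up helper name, and the archive names in the example comment are assumptions; check them against what your download actually contains.

```shell
# Sketch only: extract_nested is a hypothetical helper, not an existing tool.
# $1 = downloaded source package, $2 = output directory,
# $3 = file name pattern of the kernel archive inside the package
extract_nested() {
    outer=$1
    dest=$2
    pattern=$3
    mkdir -p "$dest" || return 1
    # Step 1: unpack the downloaded package (tar detects the compression).
    tar -xf "$outer" -C "$dest" || return 1
    # Step 2: locate the kernel archive inside it and unpack it in place.
    inner=$(find "$dest" -type f -name "$pattern" | head -n 1)
    if [ -z "$inner" ]; then
        echo "no archive matching $pattern found" >&2
        return 1
    fi
    tar -xf "$inner" -C "$(dirname "$inner")" || return 1
    echo "$inner"
}

# Example (both file names are assumptions, check your own download):
#   extract_nested public_sources.tbz2 ./l4t-src 'kernel_src.tbz2'
```

The function prints the path of the inner archive it unpacked, so you can see where the kernel tree ended up.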

There has been a git repo for this, but I’m not sure what its status is across the different releases. The L4T URL pages always have the correct kernel source for that release. I would not expect kernels from any major release after L4T R32.x to work.

The version I’m using is 4.9.337; I pulled it using the sync_source script.

Unfortunately, from the available documentation it’s really hard to tell whether that’s the very latest version available for the Nano.

I’ve often found it difficult to tell from the git version which kernel is the correct one. The link to the L4T releases in my previous post will always be correct though. You’d want the “driver package sources”, and then extract the kernel (which is yet another .tbz file) from that.
