Dear ,
How do I enable the watchdog in kernel space, and how can I confirm that it is enabled?
I have read and followed the doc in <NVIDIA_Tegra_Linux_Driver_Package> → → Watchdog Timer, but I am not sure whether the watchdog is enabled in kernel space.
I am using R28.2.1 (JetPack 3.3). Kernel config (tegra18_defconfig):
CONFIG_WATCHDOG=y
CONFIG_WATCHDOG_NOWAYOUT=y
CONFIG_TEGRA_WATCHDOG=y
And now, when I run:
sudo echo 1 > /dev/watchdog0 (it will reboot after 120 s)
Thank you.
Dear ShaneCCC,
Thanks for your information. I have seen those topics, but I want to know how to confirm that WDT0 is enabled in kernel space.
Can I confirm that the kernel-space WDT is enabled from these two checks?
- sudo echo 1 > /dev/watchdog0 (it will reboot after 120 s)
- Kernel config (tegra18_defconfig):
CONFIG_WATCHDOG=y
CONFIG_WATCHDOG_NOWAYOUT=y
CONFIG_TEGRA_WATCHDOG=y
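The kernel config half of that check can be scripted. This is a minimal sketch, assuming the running kernel exposes its configuration at /proc/config.gz (that needs CONFIG_IKCONFIG_PROC, which may not be set on every build). The helper name check_wdt_config is hypothetical and not part of L4T.

```shell
# Report the state of the three watchdog options from a kernel config
# supplied on stdin. Returns non-zero if any option is not built in (=y).
check_wdt_config() {
    cfg=$(cat)
    rc=0
    for opt in CONFIG_WATCHDOG CONFIG_WATCHDOG_NOWAYOUT CONFIG_TEGRA_WATCHDOG; do
        # Anchor the match so CONFIG_WATCHDOG does not match CONFIG_WATCHDOG_NOWAYOUT.
        if printf '%s\n' "$cfg" | grep -q "^${opt}=y\$"; then
            echo "${opt}=y"
        else
            echo "${opt} is not set to y"
            rc=1
        fi
    done
    return "$rc"
}
```

On the board it could be run as `zcat /proc/config.gz | check_wdt_config`. A passing result only shows that the driver is compiled in; it does not show that the timer is currently running.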
I haven’t looked at anything for watchdog inside of the kernel, but you might start with the kernel source file:
Documentation/ABI/testing/sysfs-class-watchdog
My thought is that if you know what you want exported to sysfs, then you can track that down within the actual watchdog code and see if there is some convenient way to hook into it (or just monitor a sysfs file).
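Monitoring a sysfs file could look like the sketch below. It assumes the kernel was built with the watchdog sysfs interface that the ABI document describes (CONFIG_WATCHDOG_SYSFS in mainline kernels; whether the 4.4-based L4T kernel provides it is an assumption to verify). The function name show_wdt_sysfs and the default path are hypothetical choices for illustration.

```shell
# Print whichever watchdog attributes exist in the given sysfs directory.
# The default directory is an assumption: the first watchdog device.
show_wdt_sysfs() {
    dir=${1:-/sys/class/watchdog/watchdog0}
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        return 1
    fi
    # Attribute names are taken from Documentation/ABI/testing/sysfs-class-watchdog.
    for attr in identity state status nowayout timeout timeleft bootstatus; do
        if [ -r "$dir/$attr" ]; then
            printf '%s: %s\n' "$attr" "$(cat "$dir/$attr")"
        fi
    done
}
```

If the directory exists and `state` reads `active`, the timer is running; if the directory is missing, the sysfs interface is probably not built into this kernel and another method is needed.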
Dear linuxdev & ShaneCCC,
Thanks for your reply. My problem is that the watchdog was not triggered even though the system had hung (a plugged-in USB device was not recognized, and login on the tty was not possible).
Here is the kern.log from when the system hung because it ran out of memory; meanwhile, the watchdog was not triggered to reboot Ubuntu.
Thanks so much for your patience.
Nov 30 16:53:49 tegra-ubuntu kernel: [ 121.869967] omxh264dec-omxh invoked oom-killer: gfp_mask=0x24201ca, order=0, oom_score_adj=0
Nov 30 16:53:49 tegra-ubuntu kernel: [ 121.880273] omxh264dec-omxh cpuset=/ mems_allowed=0
Nov 30 16:53:49 tegra-ubuntu kernel: [ 121.886626] CPU: 0 PID: 3793 Comm: omxh264dec-omxh Not tainted 4.4.38-tegra #1
Nov 30 16:53:49 tegra-ubuntu kernel: [ 121.895634] Hardware name: jetson_tx1 (DT)
Nov 30 16:53:49 tegra-ubuntu kernel: [ 121.900634] Call trace:
Nov 30 16:53:49 tegra-ubuntu kernel: [ 121.903973] [<ffffffc000088fc8>] dump_backtrace+0x0/0xf4
Nov 30 16:53:49 tegra-ubuntu kernel: [ 121.910200] [<ffffffc0000890d0>] show_stack+0x14/0x1c
Nov 30 16:53:49 tegra-ubuntu kernel: [ 121.916162] [<ffffffc0003769ac>] dump_stack+0xac/0xe4
Nov 30 16:53:49 tegra-ubuntu kernel: [ 121.922110] [<ffffffc0001c8460>] dump_header.isra.7+0x60/0x1a4
Nov 30 16:53:49 tegra-ubuntu kernel: [ 121.928835] [<ffffffc00016fe24>] oom_kill_process+0x94/0x434
Nov 30 16:53:49 tegra-ubuntu kernel: [ 121.935378] [<ffffffc0001704dc>] out_of_memory+0x298/0x2e0
Nov 30 16:53:49 tegra-ubuntu kernel: [ 121.941739] [<ffffffc00017517c>] __alloc_pages_nodemask+0xa64/0xa8c
Nov 30 16:53:49 tegra-ubuntu kernel: [ 121.948880] [<ffffffc00016ec38>] filemap_fault+0x308/0x47c
Nov 30 16:53:49 tegra-ubuntu kernel: [ 121.955236] [<ffffffc00024af18>] ext4_filemap_fault+0x34/0x50
Nov 30 16:53:49 tegra-ubuntu kernel: [ 121.961848] [<ffffffc000196adc>] __do_fault+0x3c/0xa4
Nov 30 16:53:49 tegra-ubuntu kernel: [ 121.967755] [<ffffffc00019a5bc>] handle_mm_fault+0x764/0x154c
Nov 30 16:53:49 tegra-ubuntu kernel: [ 121.974359] [<ffffffc00009912c>] do_page_fault+0x1a8/0x454
Nov 30 16:53:49 tegra-ubuntu kernel: [ 121.980699] [<ffffffc000080b98>] do_mem_abort+0x40/0x9c
Nov 30 16:53:49 tegra-ubuntu kernel: [ 121.986775] [<ffffffc000080c68>] do_el0_ia_bp_hardening+0x74/0x7c
Nov 30 16:53:49 tegra-ubuntu kernel: [ 121.993722] [<ffffffc000084478>] el0_ia+0x18/0x1c
Nov 30 16:53:49 tegra-ubuntu kernel: [ 121.999778] Mem-Info:
Nov 30 16:53:49 tegra-ubuntu kernel: [ 122.002962] active_anon:475879 inactive_anon:33311 isolated_anon:0
Nov 30 16:53:49 tegra-ubuntu kernel: [ 122.002962] active_file:2120 inactive_file:2908 isolated_file:32
Nov 30 16:53:49 tegra-ubuntu kernel: [ 122.002962] unevictable:0 dirty:0 writeback:0 unstable:0
Nov 30 16:53:49 tegra-ubuntu kernel: [ 122.002962] slab_reclaimable:5119 slab_unreclaimable:10093
Nov 30 16:53:49 tegra-ubuntu kernel: [ 122.002962] mapped:31973 shmem:33650 pagetables:2877 bounce:0
Nov 30 16:53:49 tegra-ubuntu kernel: [ 122.002962] free:4186 free_pcp:337 free_cma:2881
Nov 30 16:53:49 tegra-ubuntu kernel: [ 122.042006] DMA free:13448kB min:4068kB low:5084kB high:6100kB active_anon:977948kB inactive_anon:65980kB active_file:8136kB inactive_file:9532kB unevictable:0kB isolated(anon):0kB isolated(file):128kB present:2076672kB managed:2055988kB mlocked:0kB dirty:0kB writeback:0kB mapped:65056kB shmem:66560kB slab_reclaimable:7344kB slab_unreclaimable:14108kB kernel_stack:3232kB pagetables:5668kB unstable:0kB bounce:0kB free_pcp:1064kB local_pcp:120kB free_cma:11524kB writeback_tmp:0kB pages_scanned:106708 all_unreclaimable? yes
Nov 30 16:53:49 tegra-ubuntu kernel: [ 122.094337] lowmem_reserve[]: 0 1976 1976
Nov 30 16:53:49 tegra-ubuntu kernel: [ 122.099465] Normal free:3792kB min:4004kB low:5004kB high:6004kB active_anon:925568kB inactive_anon:67264kB active_file:88kB inactive_file:1824kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2097152kB managed:2023448kB mlocked:0kB dirty:0kB writeback:0kB mapped:62836kB shmem:68040kB slab_reclaimable:13132kB slab_unreclaimable:26264kB kernel_stack:4992kB pagetables:5840kB unstable:0kB bounce:0kB free_pcp:372kB local_pcp:236kB free_cma:0kB writeback_tmp:0kB pages_scanned:29780 all_unreclaimable? yes
Nov 30 16:53:49 tegra-ubuntu kernel: [ 122.153539] lowmem_reserve[]: 0 0 0
Nov 30 16:53:49 tegra-ubuntu kernel: [ 122.158150] DMA: 264*4kB (UMEC) 156*8kB (UMC) 14*16kB (UMC) 3*32kB (C) 13*64kB (C) 14*128kB (C) 9*256kB (C) 4*512kB (C) 4*1024kB (C) 0*2048kB 0*4096kB = 13696kB
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.176370] Normal: 770*4kB (UME) 89*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3792kB
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.190734] 38084 total pagecache pages
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.195747] 0 pages in swap cache
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.200282] Swap cache stats: add 0, delete 0, find 0/0
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.206714] Free swap = 0kB
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.210809] Total swap = 0kB
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.214849] 1043456 pages RAM
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.218942] 0 pages HighMem/MovableOnly
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.224014] 23597 pages reserved
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.228370] 4096 pages cma reserved
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.232927] [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.243622] [ 239] 0 239 5191 1332 9 3 0 0 systemd-journal
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.255196] [ 249] 0 249 19420 42 5 3 0 0 lvmetad
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.265959] [ 282] 0 282 3090 337 8 3 0 -1000 systemd-udevd
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.277283] [ 410] 100 410 19326 61 6 3 0 0 systemd-timesyn
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.288626] [ 430] 0 430 77569 388 18 4 0 0 ModemManager
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.301033] [ 432] 0 432 985 100 5 3 0 0 systemd-logind
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.312237] [ 434] 110 434 1484 114 6 3 0 0 avahi-daemon
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.323119] [ 439] 0 439 441 20 4 3 0 0 runsvdir
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.333582] [ 445] 105 445 1660 282 6 3 0 -900 dbus-daemon
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.344258] [ 456] 110 456 1454 79 6 3 0 0 avahi-daemon
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.355025] [ 499] 0 499 1601 68 6 3 0 0 cron
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.365097] [ 507] 0 507 3914 286 11 3 0 0 cupsd
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.375289] [ 510] 0 510 41651 309 15 3 0 0 cups-browsed
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.386157] [ 512] 108 512 56463 221 10 3 0 0 rsyslogd
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.396598] [ 547] 121 547 1702 85 6 3 0 0 gpsd
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.406819] [ 549] 0 549 1965 199 7 3 0 0 ofonod
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.417064] [ 568] 0 568 58226 282 15 3 0 0 accounts-daemon
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.428131] [ 590] 7 590 2905 184 8 3 0 0 dbus
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.438235] [ 982] 111 982 2456 95 7 3 0 0 dnsmasq
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.448525] [ 995] 0 995 57844 521 14 3 0 0 polkitd
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.458900] [ 1173] 106 1173 1632 173 6 3 0 0 systemd
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.469266] [ 1175] 106 1175 2800 608 8 3 0 0 (sd-pam)
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.479629] [ 1203] 106 1203 21398 108 9 3 0 0 gnome-keyring-d
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.490623] [ 1499] 0 1499 2441 165 9 3 0 -1000 sshd
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.501749] [ 1501] 0 1501 133819 3063 50 5 0 -500 dockerd
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.512116] [ 1503] 0 1503 3073 213 9 3 0 0 argus_daemon
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.522905] [ 1514] 0 1514 8781 448 19 3 0 0 nvcamera-daemon
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.534041] [ 1516] 107 1516 61267 388 20 3 0 0 whoopsie
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.544466] [ 1550] 106 1550 2253 158 7 3 0 0 upstart
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.554789] [ 1620] 116 1620 38141 71 9 3 0 0 rtkit-daemon
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.565638] [ 1632] 0 1632 68769 297 20 3 0 0 upowerd
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.575981] [ 1663] 112 1663 59394 655 16 3 0 0 colord
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.586232] [ 1707] 0 1707 3401 271 9 3 0 0 sshd
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.596414] [ 1815] 0 1815 6319 3406 15 3 0 0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.606700] [ 1842] 106 1842 94547 151 16 3 0 0 indicator-bluet
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.617810] [ 1966] 0 1966 5336 2664 13 3 0 0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.628156] [ 1999] 0 1999 5818 2900 14 3 0 0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.638567] [ 2020] 0 2020 5862 3147 15 3 0 0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.648958] [ 2044] 0 2044 6484 3631 16 3 0 0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.659264] [ 2059] 0 2059 5809 3066 15 3 0 0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.669532] [ 2080] 0 2080 5809 3065 14 3 0 0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.679796] [ 2095] 0 2095 5519 2846 13 3 0 0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.690061] [ 2105] 0 2105 5486 2763 13 3 0 0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.700460] [ 2111] 0 2111 5486 2809 14 3 0 0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.710728] [ 2117] 0 2117 7351 4598 17 3 0 0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.721088] [ 2126] 0 2126 8231 5257 18 3 0 0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.731393] [ 2127] 0 2127 61780 3650 20 4 0 0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.741658] [ 2131] 0 2131 61844 3657 20 4 0 0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.752026] [ 2136] 0 2136 6649 3803 15 3 0 0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.762373] [ 2137] 0 2137 6843 4102 15 3 0 0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.772630] [ 2138] 0 2138 6779 3917 15 3 0 0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.783030] [ 2139] 0 2139 6548 3643 15 3 0 0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.793407] [ 2140] 1000 2140 1665 177 7 3 0 0 systemd
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.803810] [ 2143] 1000 2143 2831 622 8 3 0 0 (sd-pam)
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.814249] [ 2244] 1000 2244 3401 275 9 3 0 0 sshd
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.824518] [ 2245] 1000 2245 2031 638 7 3 0 0 bash
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.834689] [ 2261] 0 2261 8231 5273 18 3 0 0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.844972] [ 2262] 0 2262 100391 5316 27 3 0 0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.855251] [ 2398] 0 2398 1289 33 5 3 0 0 agetty
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.865511] [ 2399] 0 2399 1335 32 5 3 0 0 agetty
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.875825] [ 2817] 0 2817 2514 150 8 3 0 0 cron
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.885993] [ 2818] 0 2818 2514 150 8 3 0 0 cron
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.896103] [ 2840] 0 2840 468 17 4 3 0 0 sh
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.906077] [ 2850] 0 2850 468 16 4 3 0 0 sh
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.916063] [ 2924] 0 2924 71969 581 24 5 0 -500 docker-containe
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.927160] [ 3535] 0 3535 1596547 317990 1521 9 0 0 stream_main
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.937911] [ 3697] 0 3697 2362794 173861 551 5 0 0 ubox_main
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.948452] [ 3964] 0 3964 137 1 3 3 0 0 sh
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.958336] [ 3965] 0 3965 331 1 3 3 0 0 bash
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.969662] [ 3966] 0 3966 331 1 3 3 0 0 bash
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.979818] Out of memory: Kill process 3535 (stream_main) score 303 or sacrifice child
Nov 30 16:53:50 tegra-ubuntu kernel: [ 122.989848] Killed process 3535 (stream_main) total-vm:6386188kB, anon-rss:1101432kB, file-rss:170528kB
I don’t know if the watchdog has a flaw when memory runs out, but lack of memory is what is causing the failure. Although the watchdog should (in theory) work even if the lockup is due to no memory being left, it might be useful to find out what was going on to reach that state.
I don’t know what “stream_main” is, although I see references to NUMA memory management. If the system locks up upon killing something critical to memory management I could easily see how that might bring down the system. I also see docker going, and wonder if the two are related. The “rss” is rather high, so it indicates this is a significant consumer of physical memory.
Is anything special going on with docker? Would anyone know if “stream_main” is related to use of GPU (I am assuming you had something using GPU at the time, but perhaps not)?
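To see which processes held the most physical memory when the OOM killer ran, the process table in the log can be sorted by its rss column (counted in pages). This is a rough sketch, assuming the kern.log layout shown above: a syslog prefix, the bracketed pid, then eight columns ending in a process name without spaces. The helper name oom_top_rss is hypothetical.

```shell
# Read kern.log text on stdin and print "rss_pages tgid name" for the N
# largest entries of the OOM process table (N defaults to 5).
oom_top_rss() {
    n=${1:-5}
    # Count fields from the end of the line so the syslog prefix does not matter:
    # $NF=name, $(NF-5)=rss, $(NF-7)=tgid, $(NF-9)=closing part of "[ pid]".
    awk 'NF >= 10 && $(NF-9) ~ /^[[]?[0-9]+[]]$/ && $(NF-5) ~ /^[0-9]+$/ {
             print $(NF-5), $(NF-7), $NF
         }' | sort -k1,1nr | head -n "$n"
}
```

For example, `oom_top_rss 3 < /var/log/kern.log` would list the three largest consumers, assuming the log lives at that path. If the file holds several OOM reports, their tables are merged in the output.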
Dear linuxdev,
Thanks so much for your reply.
“stream_main” is a video capture process written by our engineer. We use it to overload the CPU and GPU so that we can do performance testing of the TX1.
The problem we found is that when the system hangs the watchdog should fire, but it did not, and this makes our software system unreliable.
We will do some other tests to find a way to trigger the watchdog.
Could you try the command below to enable the watchdog?
sudo su
echo 0 > /sys/kernel/debug/tegra_wdt/disable_wdt_reset
@Dennis
The disable_wdt_reset command above was not for the current release.
I have checked: the watchdog is enabled by default and should reset the system 120 s after a hang. Your system may not have hung completely.
My steps to make the system hang completely:
sudo su
cd /proc/sys/kernel
echo 0 > panic
echo c > /proc/sysrq-trigger
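The steps above can be wrapped in a guarded function so that the crash is not triggered by accident. This is a sketch that assumes root privileges; the name wdt_crash_test and the --force argument are hypothetical, and the sync call is an addition that is not in the original steps. With kernel.panic set to 0 the kernel does not reboot by itself after a panic, so a reboot that follows should come from the watchdog.

```shell
# Print the crash-test steps; run them only when called with --force.
# The writes need root, and --force really crashes the running kernel.
wdt_crash_test() {
    panic_file=/proc/sys/kernel/panic
    trigger_file=/proc/sysrq-trigger
    if [ "${1:-}" != "--force" ]; then
        echo "dry run: would write 0 to $panic_file"
        echo "dry run: would write c to $trigger_file"
        return 0
    fi
    echo 0 > "$panic_file"     # do not auto-reboot after a panic
    sync                       # flush pending writes before crashing
    echo c > "$trigger_file"   # crash the kernel; the watchdog should reset the board
}
```

Calling `wdt_crash_test` with no argument only prints what would happen. Calling `wdt_crash_test --force` as root performs the test, after which the board should reset once the watchdog timeout (120 s in this thread) expires.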
Dear ShaneCCC,
Thanks for the information. We verified that the watchdog triggers in both kernel and user space. It was a good experience.