How to enable the watchdog in kernel space, and how can I confirm the watchdog is enabled?

Dear ,

How do I enable the watchdog in kernel space, and how can I confirm that it is enabled?
I have read and followed the doc in <NVIDIA_Tegra_Linux_Driver_Package> -> -> Watchdog Timer, but I am not sure whether the watchdog is enabled in kernel space.

I am using R28.2.1 (Jetpack 3.3), Kernel config list: (tegra18_defconfig)
CONFIG_WATCHDOG=y
CONFIG_WATCHDOG_NOWAYOUT=y
CONFIG_TEGRA_WATCHDOG=y

And now,
sudo echo 1 > /dev/watchdog0 (it will reboot after 120 s)

Thank you.
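
A note on the `echo` test above: the redirection in `sudo echo 1 > /dev/watchdog0` is performed by the calling shell, not by `sudo`, so it only works from a shell that already has write access to the device. On the standard Linux watchdog character device, opening the node starts the timer and any write counts as a keepalive; with `CONFIG_WATCHDOG_NOWAYOUT=y` closing the node does not stop it, so a reboot after the timeout is the expected sign that the driver is active. A minimal sketch of a keepalive loop under those assumptions (`pet_watchdog` and its arguments are hypothetical names, not part of any NVIDIA tool):

```shell
# pet_watchdog DEVICE INTERVAL COUNT
# Hold DEVICE open on fd 3 and write one byte every INTERVAL seconds, COUNT times.
pet_watchdog() {
    dev=$1 interval=$2 count=$3
    i=0
    {
        while [ "$i" -lt "$count" ]; do
            printf '.' >&3          # any write acts as a keepalive ping
            i=$((i + 1))
            sleep "$interval"
        done
    } 3> "$dev"                     # opening the device arms the timer
    # With NOWAYOUT set, the timer keeps running after fd 3 is closed here.
}
```

Run as root, for example `pet_watchdog /dev/watchdog0 10 6`. Once the loop ends and nothing else pings the device, the board should reset after the timeout (120 s according to the post).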

Have a look at the topics below.
https://devtalk.nvidia.com/default/topic/891384/jetson-tk1/jetson-tk1-watchdog/post/4712102/#4712102
https://devtalk.nvidia.com/default/topic/975935

Dear ShaneCCC,

Thanks for the information. I have read those topics, but I want to know how to confirm that WDT0 is enabled in kernel space.
Can I confirm that the kernel-space WDT is enabled through these two checks?

  1. sudo echo 1 > /dev/watchdog0 (it will reboot after 120 s)
  2. Kernel config list: (tegra18_defconfig)
CONFIG_WATCHDOG=y
CONFIG_WATCHDOG_NOWAYOUT=y
CONFIG_TEGRA_WATCHDOG=y
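
The second check can be run against the running kernel instead of the defconfig. A minimal sketch, assuming the kernel exposes its configuration at `/proc/config.gz` (this needs `CONFIG_IKCONFIG_PROC`, which may not be set on every build); `wdt_config` is a hypothetical helper name:

```shell
# wdt_config FILE: print the watchdog options listed above from a plain-text kernel config.
wdt_config() {
    grep -E '^CONFIG_(WATCHDOG|WATCHDOG_NOWAYOUT|TEGRA_WATCHDOG)=' "$1"
}

# Usage on the target (only if /proc/config.gz exists):
#   zcat /proc/config.gz > /tmp/running.config && wdt_config /tmp/running.config
```

These options only show that the driver is built into the kernel; they do not show that the timer is currently running.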

I haven’t looked at anything for watchdog inside of the kernel, but you might start with the kernel source file:

Documentation/ABI/testing/sysfs-class-watchdog

My thought is that if you know what you want exported to sysfs, then you can track that down within the actual watchdog code and see if there is some convenient way to hook into it (or just monitor a sysfs file).
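
Monitoring the sysfs files could look like the sketch below. It assumes the attribute names described in that ABI file (`identity`, `state`, `status`, `timeout`, `timeleft`, `nowayout`, `bootstatus`) and that the kernel was built with `CONFIG_WATCHDOG_SYSFS`; on a 4.4-based Tegra kernel these files may simply not exist. `show_wdt` is a hypothetical helper name.

```shell
# show_wdt DIR: print "name=value" for each known watchdog attribute readable under DIR.
show_wdt() {
    for attr in identity state status timeout timeleft nowayout bootstatus; do
        if [ -r "$1/$attr" ]; then
            printf '%s=%s\n' "$attr" "$(cat "$1/$attr")"
        fi
    done
    return 0    # missing attributes are skipped, not treated as errors
}

# Usage: show_wdt /sys/class/watchdog/watchdog0
```

If the directory exists but none of the attributes do, the function prints nothing, which would suggest the sysfs interface is not enabled in this kernel.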

Dear linuxdev & ShaneCCC,

Thanks for your reply. My problem is that the watchdog was not triggered even though the system had hung (a plugged-in USB device was not recognized, and I could not log in on a tty).
Here is the kern.log from when the system hung because it ran out of memory; the watchdog was not triggered to reboot Ubuntu at that time.
Thanks so much for your patience.

Nov 30 16:53:49 tegra-ubuntu kernel: [  121.869967] omxh264dec-omxh invoked oom-killer: gfp_mask=0x24201ca, order=0, oom_score_adj=0
Nov 30 16:53:49 tegra-ubuntu kernel: [  121.880273] omxh264dec-omxh cpuset=/ mems_allowed=0
Nov 30 16:53:49 tegra-ubuntu kernel: [  121.886626] CPU: 0 PID: 3793 Comm: omxh264dec-omxh Not tainted 4.4.38-tegra #1
Nov 30 16:53:49 tegra-ubuntu kernel: [  121.895634] Hardware name: jetson_tx1 (DT)
Nov 30 16:53:49 tegra-ubuntu kernel: [  121.900634] Call trace:
Nov 30 16:53:49 tegra-ubuntu kernel: [  121.903973] [<ffffffc000088fc8>] dump_backtrace+0x0/0xf4
Nov 30 16:53:49 tegra-ubuntu kernel: [  121.910200] [<ffffffc0000890d0>] show_stack+0x14/0x1c
Nov 30 16:53:49 tegra-ubuntu kernel: [  121.916162] [<ffffffc0003769ac>] dump_stack+0xac/0xe4
Nov 30 16:53:49 tegra-ubuntu kernel: [  121.922110] [<ffffffc0001c8460>] dump_header.isra.7+0x60/0x1a4
Nov 30 16:53:49 tegra-ubuntu kernel: [  121.928835] [<ffffffc00016fe24>] oom_kill_process+0x94/0x434
Nov 30 16:53:49 tegra-ubuntu kernel: [  121.935378] [<ffffffc0001704dc>] out_of_memory+0x298/0x2e0
Nov 30 16:53:49 tegra-ubuntu kernel: [  121.941739] [<ffffffc00017517c>] __alloc_pages_nodemask+0xa64/0xa8c
Nov 30 16:53:49 tegra-ubuntu kernel: [  121.948880] [<ffffffc00016ec38>] filemap_fault+0x308/0x47c
Nov 30 16:53:49 tegra-ubuntu kernel: [  121.955236] [<ffffffc00024af18>] ext4_filemap_fault+0x34/0x50
Nov 30 16:53:49 tegra-ubuntu kernel: [  121.961848] [<ffffffc000196adc>] __do_fault+0x3c/0xa4
Nov 30 16:53:49 tegra-ubuntu kernel: [  121.967755] [<ffffffc00019a5bc>] handle_mm_fault+0x764/0x154c
Nov 30 16:53:49 tegra-ubuntu kernel: [  121.974359] [<ffffffc00009912c>] do_page_fault+0x1a8/0x454
Nov 30 16:53:49 tegra-ubuntu kernel: [  121.980699] [<ffffffc000080b98>] do_mem_abort+0x40/0x9c
Nov 30 16:53:49 tegra-ubuntu kernel: [  121.986775] [<ffffffc000080c68>] do_el0_ia_bp_hardening+0x74/0x7c
Nov 30 16:53:49 tegra-ubuntu kernel: [  121.993722] [<ffffffc000084478>] el0_ia+0x18/0x1c
Nov 30 16:53:49 tegra-ubuntu kernel: [  121.999778] Mem-Info:
Nov 30 16:53:49 tegra-ubuntu kernel: [  122.002962] active_anon:475879 inactive_anon:33311 isolated_anon:0
Nov 30 16:53:49 tegra-ubuntu kernel: [  122.002962]  active_file:2120 inactive_file:2908 isolated_file:32
Nov 30 16:53:49 tegra-ubuntu kernel: [  122.002962]  unevictable:0 dirty:0 writeback:0 unstable:0
Nov 30 16:53:49 tegra-ubuntu kernel: [  122.002962]  slab_reclaimable:5119 slab_unreclaimable:10093
Nov 30 16:53:49 tegra-ubuntu kernel: [  122.002962]  mapped:31973 shmem:33650 pagetables:2877 bounce:0
Nov 30 16:53:49 tegra-ubuntu kernel: [  122.002962]  free:4186 free_pcp:337 free_cma:2881
Nov 30 16:53:49 tegra-ubuntu kernel: [  122.042006] DMA free:13448kB min:4068kB low:5084kB high:6100kB active_anon:977948kB inactive_anon:65980kB active_file:8136kB inactive_file:9532kB unevictable:0kB isolated(anon):0kB isolated(file):128kB present:2076672kB managed:2055988kB mlocked:0kB dirty:0kB writeback:0kB mapped:65056kB shmem:66560kB slab_reclaimable:7344kB slab_unreclaimable:14108kB kernel_stack:3232kB pagetables:5668kB unstable:0kB bounce:0kB free_pcp:1064kB local_pcp:120kB free_cma:11524kB writeback_tmp:0kB pages_scanned:106708 all_unreclaimable? yes
Nov 30 16:53:49 tegra-ubuntu kernel: [  122.094337] lowmem_reserve[]: 0 1976 1976
Nov 30 16:53:49 tegra-ubuntu kernel: [  122.099465] Normal free:3792kB min:4004kB low:5004kB high:6004kB active_anon:925568kB inactive_anon:67264kB active_file:88kB inactive_file:1824kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2097152kB managed:2023448kB mlocked:0kB dirty:0kB writeback:0kB mapped:62836kB shmem:68040kB slab_reclaimable:13132kB slab_unreclaimable:26264kB kernel_stack:4992kB pagetables:5840kB unstable:0kB bounce:0kB free_pcp:372kB local_pcp:236kB free_cma:0kB writeback_tmp:0kB pages_scanned:29780 all_unreclaimable? yes
Nov 30 16:53:49 tegra-ubuntu kernel: [  122.153539] lowmem_reserve[]: 0 0 0
Nov 30 16:53:49 tegra-ubuntu kernel: [  122.158150] DMA: 264*4kB (UMEC) 156*8kB (UMC) 14*16kB (UMC) 3*32kB (C) 13*64kB (C) 14*128kB (C) 9*256kB (C) 4*512kB (C) 4*1024kB (C) 0*2048kB 0*4096kB = 13696kB
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.176370] Normal: 770*4kB (UME) 89*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3792kB
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.190734] 38084 total pagecache pages
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.195747] 0 pages in swap cache
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.200282] Swap cache stats: add 0, delete 0, find 0/0
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.206714] Free swap  = 0kB
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.210809] Total swap = 0kB
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.214849] 1043456 pages RAM
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.218942] 0 pages HighMem/MovableOnly
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.224014] 23597 pages reserved
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.228370] 4096 pages cma reserved
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.232927] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.243622] [  239]     0   239     5191     1332       9       3        0             0 systemd-journal
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.255196] [  249]     0   249    19420       42       5       3        0             0 lvmetad
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.265959] [  282]     0   282     3090      337       8       3        0         -1000 systemd-udevd
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.277283] [  410]   100   410    19326       61       6       3        0             0 systemd-timesyn
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.288626] [  430]     0   430    77569      388      18       4        0             0 ModemManager
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.301033] [  432]     0   432      985      100       5       3        0             0 systemd-logind
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.312237] [  434]   110   434     1484      114       6       3        0             0 avahi-daemon
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.323119] [  439]     0   439      441       20       4       3        0             0 runsvdir
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.333582] [  445]   105   445     1660      282       6       3        0          -900 dbus-daemon
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.344258] [  456]   110   456     1454       79       6       3        0             0 avahi-daemon
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.355025] [  499]     0   499     1601       68       6       3        0             0 cron
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.365097] [  507]     0   507     3914      286      11       3        0             0 cupsd
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.375289] [  510]     0   510    41651      309      15       3        0             0 cups-browsed
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.386157] [  512]   108   512    56463      221      10       3        0             0 rsyslogd
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.396598] [  547]   121   547     1702       85       6       3        0             0 gpsd
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.406819] [  549]     0   549     1965      199       7       3        0             0 ofonod
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.417064] [  568]     0   568    58226      282      15       3        0             0 accounts-daemon
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.428131] [  590]     7   590     2905      184       8       3        0             0 dbus
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.438235] [  982]   111   982     2456       95       7       3        0             0 dnsmasq
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.448525] [  995]     0   995    57844      521      14       3        0             0 polkitd
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.458900] [ 1173]   106  1173     1632      173       6       3        0             0 systemd
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.469266] [ 1175]   106  1175     2800      608       8       3        0             0 (sd-pam)
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.479629] [ 1203]   106  1203    21398      108       9       3        0             0 gnome-keyring-d
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.490623] [ 1499]     0  1499     2441      165       9       3        0         -1000 sshd
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.501749] [ 1501]     0  1501   133819     3063      50       5        0          -500 dockerd
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.512116] [ 1503]     0  1503     3073      213       9       3        0             0 argus_daemon
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.522905] [ 1514]     0  1514     8781      448      19       3        0             0 nvcamera-daemon
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.534041] [ 1516]   107  1516    61267      388      20       3        0             0 whoopsie
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.544466] [ 1550]   106  1550     2253      158       7       3        0             0 upstart
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.554789] [ 1620]   116  1620    38141       71       9       3        0             0 rtkit-daemon
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.565638] [ 1632]     0  1632    68769      297      20       3        0             0 upowerd
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.575981] [ 1663]   112  1663    59394      655      16       3        0             0 colord
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.586232] [ 1707]     0  1707     3401      271       9       3        0             0 sshd
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.596414] [ 1815]     0  1815     6319     3406      15       3        0             0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.606700] [ 1842]   106  1842    94547      151      16       3        0             0 indicator-bluet
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.617810] [ 1966]     0  1966     5336     2664      13       3        0             0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.628156] [ 1999]     0  1999     5818     2900      14       3        0             0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.638567] [ 2020]     0  2020     5862     3147      15       3        0             0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.648958] [ 2044]     0  2044     6484     3631      16       3        0             0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.659264] [ 2059]     0  2059     5809     3066      15       3        0             0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.669532] [ 2080]     0  2080     5809     3065      14       3        0             0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.679796] [ 2095]     0  2095     5519     2846      13       3        0             0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.690061] [ 2105]     0  2105     5486     2763      13       3        0             0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.700460] [ 2111]     0  2111     5486     2809      14       3        0             0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.710728] [ 2117]     0  2117     7351     4598      17       3        0             0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.721088] [ 2126]     0  2126     8231     5257      18       3        0             0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.731393] [ 2127]     0  2127    61780     3650      20       4        0             0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.741658] [ 2131]     0  2131    61844     3657      20       4        0             0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.752026] [ 2136]     0  2136     6649     3803      15       3        0             0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.762373] [ 2137]     0  2137     6843     4102      15       3        0             0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.772630] [ 2138]     0  2138     6779     3917      15       3        0             0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.783030] [ 2139]     0  2139     6548     3643      15       3        0             0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.793407] [ 2140]  1000  2140     1665      177       7       3        0             0 systemd
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.803810] [ 2143]  1000  2143     2831      622       8       3        0             0 (sd-pam)
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.814249] [ 2244]  1000  2244     3401      275       9       3        0             0 sshd
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.824518] [ 2245]  1000  2245     2031      638       7       3        0             0 bash
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.834689] [ 2261]     0  2261     8231     5273      18       3        0             0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.844972] [ 2262]     0  2262   100391     5316      27       3        0             0 python
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.855251] [ 2398]     0  2398     1289       33       5       3        0             0 agetty
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.865511] [ 2399]     0  2399     1335       32       5       3        0             0 agetty
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.875825] [ 2817]     0  2817     2514      150       8       3        0             0 cron
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.885993] [ 2818]     0  2818     2514      150       8       3        0             0 cron
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.896103] [ 2840]     0  2840      468       17       4       3        0             0 sh
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.906077] [ 2850]     0  2850      468       16       4       3        0             0 sh
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.916063] [ 2924]     0  2924    71969      581      24       5        0          -500 docker-containe
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.927160] [ 3535]     0  3535  1596547   317990    1521       9        0             0 stream_main
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.937911] [ 3697]     0  3697  2362794   173861     551       5        0             0 ubox_main
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.948452] [ 3964]     0  3964      137        1       3       3        0             0 sh
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.958336] [ 3965]     0  3965      331        1       3       3        0             0 bash
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.969662] [ 3966]     0  3966      331        1       3       3        0             0 bash
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.979818] Out of memory: Kill process 3535 (stream_main) score 303 or sacrifice child
Nov 30 16:53:50 tegra-ubuntu kernel: [  122.989848] Killed process 3535 (stream_main) total-vm:6386188kB, anon-rss:1101432kB, file-rss:170528kB

I don’t know whether the watchdog has a flaw in out-of-memory cases, but lack of memory is what is causing the failure. Although the watchdog should (in theory) work even if the lockup is due to no memory being left, it might be useful to find out what was going on to reach that state.

I don’t know what “stream_main” is, although I see references to NUMA memory management. If the system locks up upon killing something critical to memory management I could easily see how that might bring down the system. I also see docker going, and wonder if the two are related. The “rss” is rather high, so it indicates this is a significant consumer of physical memory.
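
The `rss` column of the OOM table is counted in pages, so the figure for “stream_main” can be converted to kB as a cross-check. A small sketch, assuming 4 kB pages (consistent with the `4kB` smallest block in the log's free lists); `pages_to_kb` is a hypothetical helper name:

```shell
# pages_to_kb PAGES [PAGE_KB]: convert a page count to kB (defaults to 4 kB pages).
pages_to_kb() {
    echo $(( $1 * ${2:-4} ))
}

pages_to_kb 317990    # rss of stream_main (PID 3535) from the table; prints 1271960
```

That equals anon-rss plus file-rss in the "Killed process" line (1101432 kB + 170528 kB = 1271960 kB), or roughly 1.2 GiB on a board that reports 1043456 pages (about 4 GiB) of RAM.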

Is anything special going on with docker? Would anyone know if “stream_main” is related to use of GPU (I am assuming you had something using GPU at the time, but perhaps not)?

Dear linuxdev,

Thanks so much for your reply.
“stream_main” is a video capture process written by our engineers. We use it to overload the CPU and GPU so that we can performance-test the TX1.
The problem we found is that when the system hangs the watchdog should fire, but it did not, and this makes our software system unreliable.
We will run some other tests to find a way to trigger the watchdog.

Could you try the commands below to enable the watchdog?

sudo su
echo 0 > /sys/kernel/debug/tegra_wdt/disable_wdt_reset

@Dennis
The command above is not for the current release.
I have checked: the watchdog should be enabled by default and should reset the system 120 s after a hang. Your system may not be hanging completely.

My steps to make the system hang completely:

sudo su
cd /proc/sys/kernel
echo 0 > panic
echo c > /proc/sysrq-trigger
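
The steps above can be wrapped in a small helper. Writing `0` to `/proc/sys/kernel/panic` tells the kernel not to reboot on its own after a panic, and writing `c` to `/proc/sysrq-trigger` forces a crash, so any reboot that follows should come from the watchdog. This is a sketch with hypothetical names (`force_hang` and its two path arguments); the paths are parameters so the function can be dry-run against ordinary files first.

```shell
# force_hang [PANIC_FILE] [TRIGGER_FILE]
# WARNING: with the default paths this crashes the running kernel immediately.
force_hang() {
    panic_file=${1:-/proc/sys/kernel/panic}
    trigger_file=${2:-/proc/sysrq-trigger}
    echo 0 > "$panic_file" || return 1   # 0 = stay halted after a panic, no automatic reboot
    echo c > "$trigger_file"             # 'c' = deliberate kernel crash
}

# Dry run:  force_hang /tmp/panic.test /tmp/trigger.test
# Real run: force_hang        (as root; unsaved data will be lost)
```

If the watchdog is active, the board should reset about 120 s after the crash, as described earlier in the thread.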

Dear ShaneCCC,

Thanks for the information. We verified that the watchdog triggers in both kernel and user space. Good experience.