TX2 Watchdog Functionality

ian.bell87 · November 14, 2018, 6:30pm

Hi All,

A couple of questions on the watchdog functionality (L4T 28.2). First, I was trying to experiment with the watchdog. My understanding is that running:

sudo tail -f /dev/watchdog

should enable the watchdog, which eventually times out and restarts the system. However, when I run this command (as the nvidia user) I get the following output:

tail: error reading '/dev/watchdog': Invalid argument
tail: '/dev/watchdog' has become accessible
tail: /dev/watchdog: cannot seek to offset 0: Illegal seek

Any thoughts on this?

Second question: We have installed several TX2 units in an industrial application. In general they work well for extended periods of time; however, occasionally we find that they are simply off. Power to the unit is still there, but we cannot SSH in or reach our local webserver, and after a power cycle there are no local logs from the time they were off. Our theory is that the site power is not always consistent and at some point the unit browns out. We are looking at adding a UPS to prevent this, but I am wondering if the watchdog can be used to help in this circumstance?

Cheers
Ian

linuxdev · November 14, 2018, 9:51pm

You will find some useful docs in the kernel source. Within the kernel, look for:

Documentation/watchdog/watchdog-api.txt

This provides a sample program; just copy it to the TX2 and build it:

Documentation/watchdog/src/*
# Compile:
make watchdog-simple
sudo ./watchdog-simple

There may be more restrictions or some changes between older 3.x kernels and the newer 4.x kernels, but I haven’t actually looked to see what/when the changes occurred.
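
For reference, the idea behind that sample can be sketched in a few lines of Python. This is a minimal sketch, not the kernel's watchdog-simple program: the function name pet_watchdog and its parameters are hypothetical, and it assumes the driver follows the behaviour described in watchdog-api.txt (opening the device arms the timer, a write resets it, and writing the character 'V' just before closing disarms it on drivers that support "magic close").

```python
import time


def pet_watchdog(path="/dev/watchdog", interval_s=10.0, count=3, disarm=True):
    """Open the watchdog device, ping it `count` times, then optionally disarm.

    The name, parameters and defaults are illustrative assumptions.
    """
    # buffering=0 so every write reaches the device immediately
    with open(path, "wb", buffering=0) as wd:
        for _ in range(count):
            wd.write(b"\0")          # any write counts as a keepalive ping
            time.sleep(interval_s)   # must stay below the watchdog timeout
        if disarm:
            wd.write(b"V")           # "magic close": ask the driver to stop


# Example (needs root on the TX2):
#   pet_watchdog(interval_s=10, count=6)
```

With disarm=False the device is closed without the magic character, so on a driver that honours this the timer keeps running and the board should reset once the timeout elapses. The device only accepts writes, so reading from it is not a way to arm it.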

ian.bell87 · November 16, 2018, 5:40pm

Thanks linuxdev, this works to let me restart via watchdog. Any thoughts on the second question above? Is there any mechanism in place to allow recovery from brown-out conditions (i.e., conditions where the OS isn’t actually running, but there is power to the system)?

Cheers

linuxdev · November 16, 2018, 6:51pm

The usual recipe for brownout is “don’t let it happen”. You could have custom hardware to monitor the line condition and act as an extension to the regular watchdog software. Perhaps you could load some area in memory with alternating patterns (e.g., 0xaa, then 0x55, then 0x00, then 0xff), and if the memory read back a few seconds later does not match, treat that as a reason to skip the watchdog “stop reset” ping so that the watchdog reboots the system (memory is perhaps the part of the system most sensitive to brownouts). But if the system is locked, or parts of it crashed, then this wouldn’t help anyway (you’d need external hardware…the hardware would accept a heartbeat from the Jetson to avoid reboot).

If it is really critical, then you should consider external hardware to monitor heartbeat.
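
The pattern check described above might look roughly like the following. It is only a sketch under stated assumptions: the function names, the buffer and the should_pet gate are hypothetical, and a user-space Python buffer is a weak stand-in for a real memory test, since caches and the allocator sit between the code and DRAM.

```python
PATTERNS = (0xAA, 0x55, 0x00, 0xFF)


def fill(buf, pattern):
    """Overwrite every byte of buf (a bytearray) with pattern."""
    buf[:] = bytes([pattern]) * len(buf)


def verify(buf, pattern):
    """Return True if every byte of buf still equals pattern."""
    return buf == bytes([pattern]) * len(buf)


def should_pet(buf, delay_fn=None):
    """Run each pattern through buf; False means skip the watchdog ping."""
    for pattern in PATTERNS:
        fill(buf, pattern)
        if delay_fn is not None:
            delay_fn()  # e.g. lambda: time.sleep(2), to wait before reading back
        if not verify(buf, pattern):
            return False
    return True
```

A caller would ping /dev/watchdog only while should_pet(...) returns True, and otherwise let the timer expire. As noted, none of this runs if the system is locked up or off, which is the case for an external heartbeat monitor.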

ian.bell87 · November 16, 2018, 8:21pm

Thanks again for the valuable input. Based on our conditions I think the path forward is adding a UPS (in line with the strategy of ‘don’t let it happen’) and also looking into an external hardware watchdog controlling a relay on the input power.

Cheers
