Watchdogs in Embedded Linux

Written by Pedro Bertoleti

For More Robust Solutions

Embedded Linux continues to grow in popularity, with uses in every kind of application. Given its ubiquity, the operating system (OS) must be as robust as possible, able to reestablish itself following critical failures and malfunctions. This article discusses how to use watchdog timers to increase the robustness of embedded-Linux-based solutions.

  • What is the purpose of a watchdog timer?
  • How does a watchdog timer work?
  • How can I implement a watchdog timer in embedded Linux?
  • Raspberry Pi Zero W

Embedded Linux is becoming more and more widely deployed in electronic devices. It’s now used in almost any application you can think of—automotive multimedia systems, mobile phones, medical devices, huge servers, and household appliances, to name a few.

As many of us are all too aware, there are many situations—both predictable and not—that can cause an embedded system in the field to malfunction, including heat exposure, electromagnetic interference, severe mechanical shocks and vibrations, and just about everything else the real world has to offer. And of course, the embedded software itself on a device could contain critical bugs. Suffice it to say there’s a host of situations that could cause an embedded device to experience critical malfunctions that lead the operating system (OS) to freeze or crash.

Embedded devices must be able to recover from such malfunctions without human intervention. To achieve this self-sufficiency, a watchdog timer, which allows a device to reboot itself in case of a crash or freeze, becomes a must-have resource. In this article, I discuss how to use a watchdog to make embedded Linux-based solutions more robust.

WHAT ARE WATCHDOGS?

In simple terms, a watchdog is a hardware resource in an embedded system that can be described as a timer with the special capability of rebooting a microcontroller (MCU), System-on-a-Chip (SoC), or System-in-a-Package (SiP) when it times out. To avoid rebooting, the software that runs on an MCU/SoC/SiP must periodically restart this timer to its original value so it doesn’t time out. This timer restart action is called the “watchdog feed.” If the software crashes or freezes, the watchdog feed is interrupted and the watchdog times out, causing the device to reboot. If, on the other hand, everything is working as it should, the watchdog feed continues uninterrupted, the watchdog doesn’t time out, and the system doesn’t reboot needlessly.
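
The feed-and-timeout behavior described above can be imitated with a small software model. The sketch below is only an illustration and touches no hardware; the function name simulate_watchdog, its parameters, and the tick-based timing are hypothetical choices made for this example.

```shell
#!/bin/sh
# Software model of a watchdog timer (illustration only, no hardware involved).
# simulate_watchdog TIMEOUT FREEZE_TICK TOTAL_TICKS
#   TIMEOUT      ticks the counter may run without a feed before "rebooting"
#   FREEZE_TICK  tick at which the simulated software stops feeding (0 = never)
#   TOTAL_TICKS  how many ticks to simulate
simulate_watchdog() {
    timeout=$1
    freeze_tick=$2
    total_ticks=$3
    counter=$timeout
    tick=1
    while [ "$tick" -le "$total_ticks" ]; do
        if [ "$freeze_tick" -eq 0 ] || [ "$tick" -lt "$freeze_tick" ]; then
            counter=$timeout          # healthy software: the feed reloads the counter
        else
            counter=$((counter - 1))  # frozen software: the counter keeps running down
        fi
        if [ "$counter" -le 0 ]; then
            echo "reboot at tick $tick"
            return 0
        fi
        tick=$((tick + 1))
    done
    echo "no reboot"
}

simulate_watchdog 3 0 10   # software never freezes: prints "no reboot"
simulate_watchdog 3 5 10   # software freezes at tick 5: prints "reboot at tick 7"
```

In the second call, the last feed happens at tick 4, so a timeout of three ticks expires at tick 7.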

The benefits are obvious. A watchdog adds significant robustness to an embedded system; a device with a watchdog can, ideally, recover when it crashes or freezes in the field. Without one, human intervention is needed when devices face these common malfunctions, resulting in significant downtime and high maintenance costs.

As dedicated hardware, watchdogs work separately from the embedded software, enhancing their reliability in determining when a system needs to be reset. Watchdogs are often represented as dedicated blocks in block diagrams, as seen in Figure 1.

Figure 1
PIC16F873A/876A block diagram, taken from the PIC16F87XA datasheet (Credit: https://ww1.microchip.com/downloads/en/devicedoc/39582b.pdf)

Two points are worth mentioning here. First, the developer must choose the watchdog timeout value wisely. Second, watchdog feeds must be done by the embedded software itself, usually in routines that are called frequently, such as a super-loop. When choosing the watchdog timeout, the developer must have a firm understanding of how fast the embedded software runs. If it runs slowly (perhaps because it waits for responses from slow devices such as hard disks and some sensors, performs other operations with long I/O waits, or because the CPU clock runs at a low speed to save power), the watchdog timeout should be high: in the range of hundreds of milliseconds to seconds. If the embedded software runs fast, lower timeout values, from tens to hundreds of milliseconds, are used.
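
One rough way to turn this guidance into a number is to measure the worst-case time between two feeds and multiply it by a safety margin. The sketch below assumes that rule of thumb; the function name suggest_timeout_ms and the margin of 3 are illustrative assumptions, not values from any datasheet or from the kernel.

```shell
#!/bin/sh
# suggest_timeout_ms WORST_CASE_LOOP_MS MARGIN
# Prints a candidate watchdog timeout: worst-case time between feeds times a margin.
suggest_timeout_ms() {
    worst_case_ms=$1
    margin=$2
    echo $((worst_case_ms * margin))
}

suggest_timeout_ms 40 3     # fast super-loop, 40 ms worst case: prints 120
suggest_timeout_ms 1500 3   # slow loop waiting on I/O, 1.5 s worst case: prints 4500
```

The first result falls in the "tens to hundreds of milliseconds" range and the second in the "seconds" range. The result still has to be checked against the timeout values the watchdog hardware actually supports.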

HARDWARE NEEDED FOR EMBEDDED LINUX WATCHDOGS

To see a watchdog in action on embedded Linux, you’ll need the following:

  • single-board computer (SBC) that supports embedded Linux
  • micro-SD card on which to flash the SBC’s embedded Linux image
  • power source to supply power to the SBC (a 5V/2A micro-USB power source, for example)

One of the most popular SBCs available is the Raspberry Pi Zero W board. It’s significantly cheaper than the full-sized Raspberry Pi boards, but it can run embedded Linux like them. Of course, if you only have a full-sized Raspberry Pi (Raspberry Pi 3 or Raspberry Pi 4, for example), you can still follow the examples described in this article—it works the same for all Raspberry Pi boards.

HANDLING AND ENABLING WATCHDOG IN EMBEDDED LINUX

To enable the watchdog in embedded Linux, two Linux kernel configs are relevant:

  • CONFIG_WATCHDOG: This must be set to Y to allow watchdog support in embedded Linux.
  • CONFIG_WATCHDOG_NOWAYOUT: This config is optional, but understanding its functionality is important when designing how the watchdog will work with embedded Linux. It’s set to N by default, which means the watchdog can be turned off when the application that was feeding it closes the /dev/watchdog file. On drivers that support the “magic close” feature, this happens only if the application writes the character “V” just before closing; if the file is closed without it (because the process crashed, for example), the watchdog keeps running and forces a reboot. If this config is set to Y, the watchdog can’t be stopped once it has started: it keeps working (and expects feeds) even if /dev/watchdog is closed, forcing a reboot if the feed operation is interrupted for longer than the timeout value.
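
To see how a given kernel was built, you can look up both options in its configuration file. The sketch below assumes such a file is available: many distros install it as /boot/config-$(uname -r), and kernels built with CONFIG_IKCONFIG_PROC expose a compressed copy as /proc/config.gz. The function name check_kconfig is hypothetical.

```shell
#!/bin/sh
# check_kconfig OPTION CONFIG_FILE
# Prints the value of a CONFIG_ option (for example "y" or "m") found in a
# kernel config file, or "not set" if the option is absent or disabled.
check_kconfig() {
    option=$1
    config_file=$2
    value=$(sed -n "s/^${option}=//p" "$config_file" | head -n 1)
    if [ -n "$value" ]; then
        echo "$value"
    else
        echo "not set"
    fi
}

# Only runs if this distro ships the config file at the assumed location.
cfg="/boot/config-$(uname -r)"
if [ -r "$cfg" ]; then
    echo "CONFIG_WATCHDOG: $(check_kconfig CONFIG_WATCHDOG "$cfg")"
    echo "CONFIG_WATCHDOG_NOWAYOUT: $(check_kconfig CONFIG_WATCHDOG_NOWAYOUT "$cfg")"
fi
```

If only /proc/config.gz is available, decompress it to a regular file first (for example with zcat /proc/config.gz > kconfig.txt) and pass that file to the function.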

Once the Linux kernel is compiled with these two configs, the watchdog is supported by embedded Linux. The watchdog feed is handled by writing characters to the /dev/watchdog file in devtmpfs. Any character can be written to this file to feed the watchdog except “V”, which the kernel reserves as the “magic close” character used to request that the watchdog be disabled when the file is closed. Note that the watchdog feed operation can only be done as superuser. Because the output redirection must also run with superuser rights, the whole command is wrapped in a privileged shell. Here is an example command that feeds the watchdog in embedded Linux:

sudo sh -c 'echo 1 > /dev/watchdog'
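
A feeder can also keep the device open between feeds and, when it wants to stop cleanly, write “V” immediately before closing. The sketch below shows that pattern with a shell file descriptor. The function names and the WATCHDOG_DEV variable are hypothetical, and the clean stop only works if the driver supports magic close and CONFIG_WATCHDOG_NOWAYOUT is not set to Y.

```shell
#!/bin/sh
# Device to use; overridable so the functions can be tried on a regular file.
WATCHDOG_DEV=${WATCHDOG_DEV:-/dev/watchdog}

# Open the device on file descriptor 3 and keep it open (this starts the timer).
wdt_start() { exec 3> "$WATCHDOG_DEV"; }

# Feed the watchdog: any character except "V" reloads the timer.
wdt_feed() { printf '1' >&3; }

# Ask the driver to disable the watchdog: write "V", then close the device.
wdt_stop() { printf 'V' >&3; exec 3>&-; }

# Example session (run as root; commented out so that sourcing this file is harmless):
#   wdt_start
#   wdt_feed; sleep 1; wdt_feed
#   wdt_stop
```

If the process dies between wdt_start and wdt_stop, the “V” is never written, so the timer keeps running and the board reboots, which is the behavior a watchdog is meant to provide.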

The developer must also keep in mind:

  • The watchdog timeout configuration: The default value is set in the device tree, since the watchdog is a hardware resource of the device running embedded Linux. This configuration is particular to each MCU/SoC/SiP. It can’t be changed at run time simply by writing to a devtmpfs file; to change the default, you must modify the device tree and rebuild the device tree blob as part of the kernel build. (Drivers that support it also let a program request a different timeout at run time through the WDIOC_SETTIMEOUT ioctl.) In the case of the Raspberry Pi Zero W board, the device tree binding documentation for its BCM2835 SoC states that the watchdog timeout is defined in the timeout-sec property of the watchdog node (Listing 1).
  • Where to feed the watchdog: The watchdog feed must be done by just one process in an embedded Linux environment. Therefore, developers should carefully choose which process will be responsible for feeding the watchdog in their embedded Linux solution.
  • When the watchdog starts: The watchdog only starts working after opening the /dev/watchdog file.

Listing 1

Watchdog node instructions for BCM2835, the SoC used in Raspberry Pi Zero W

BCM2835 Watchdog timer

Required properties:

- compatible : should be "brcm,bcm2835-pm-wdt"
- reg : Specifies base physical address and size of the registers.

Optional properties:

- timeout-sec  : Contains the watchdog timeout in seconds

Example:

watchdog {
	compatible = "brcm,bcm2835-pm-wdt";
	reg = <0x7e100000 0x28>;
	timeout-sec = <10>;
};
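
Once the board is running, the timeout that is actually in effect can be read back. The sketch below assumes the kernel was built with CONFIG_WATCHDOG_SYSFS, which exposes attributes such as timeout and nowayout under /sys/class/watchdog/watchdog0; on kernels without that option the directory has no such files and the script prints nothing. The function name show_watchdog_info is hypothetical.

```shell
#!/bin/sh
# show_watchdog_info SYSFS_DIR
# Prints the watchdog attributes that are present and readable in SYSFS_DIR.
show_watchdog_info() {
    wdt_dir=$1
    for attr in identity timeout nowayout state; do
        if [ -r "$wdt_dir/$attr" ]; then
            echo "$attr: $(cat "$wdt_dir/$attr")"
        fi
    done
}

# Reading these attributes does not open /dev/watchdog, so it does not start the timer.
show_watchdog_info /sys/class/watchdog/watchdog0
```

With a device tree like the one in Listing 1, the timeout attribute would be expected to report 10, assuming the driver honors the timeout-sec property.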

EXPERIENCING WATCHDOG IN A RASPBERRY PI ZERO W

Finally, let’s watch a watchdog work in embedded Linux. If you’re using a Raspberry Pi Zero W board as I do in this article, check the following before proceeding:

  • The OS works: The micro-SD card already has the latest embedded Linux image for the Raspberry Pi Zero W flashed on it, and embedded Linux boots successfully from this micro-SD card. I strongly recommend that you flash the Raspberry Pi OS image to the micro-SD card, as this is the official Linux distro for Raspberry Pi Zero W (and other Raspberry Pi boards as well). For more details on how to flash a Raspberry Pi OS image to a micro-SD card, see the Raspberry Pi link on Circuit Cellar’s Article Materials and Resources web page [1].
  • The Pi’s local network access works: One of the easiest and most convenient ways to access the Raspberry Pi Zero W is by opening a terminal session over SSH in a local network. You can use any SSH terminal software, like MobaXterm or PuTTY. For more information, see Circuit Cellar’s Article Materials and Resources web page [2].

In Raspberry Pi OS, the watchdog is supported by default and doesn’t require any changes to the Linux kernel configs, so you don’t need to compile the Linux kernel. But if you want to explore the Linux kernel configs and tweak them, I recommend the Linux kernel guide provided by Raspberry Pi [3].

Having checked off those items, you’re all set to check out the watchdog at work in your Raspberry Pi Zero W board. I’ll walk you through two examples in the remainder of this article. In the first, we’ll check that periodically feeding the watchdog prevents the board from rebooting. The second simulates a malfunction that interrupts the watchdog feed, causing Raspberry Pi Zero W to reset several seconds after the interruption. In both scenarios, a shell script will be used to automatically feed the watchdog every second, and print a message on the screen indicating that the watchdog has been fed. The source code for this shell script is shown in Listing 2.

Listing 2

Shell script source code

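#!/bin/bash
# Feed the watchdog once per second, forever (run as root).
while true
do
    # Each write opens /dev/watchdog, feeds it, and closes it again
    echo "1" > /dev/watchdog
    echo "Watchdog is fed"
    sleep 1
done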

Follow these steps to check that feeding the watchdog prevents the board from rebooting:

  • Go to the home folder with the command cd ~.
  • Create the shell script file by running the nano feed_watchdog.sh command.
  • Paste the source code of Listing 2 into the nano text editor, then save and exit (keyboard shortcuts: Ctrl + X, then Y, then Enter).
  • Give this shell script executable permission by using the chmod +x feed_watchdog.sh command.
  • Execute the feed_watchdog.sh shell script with the sudo ./feed_watchdog.sh command.
  • Observe the watchdog being fed (and that the “Watchdog is fed” message is printed) every second, preventing the Raspberry Pi Zero W from rebooting.

Easy enough! Let’s move on to the second test, in which we’ll simulate a malfunction that interrupts the watchdog feed operation. To simulate a crash in the process that’s responsible for feeding the watchdog, we’ll stop the shell script we created. This causes the Raspberry Pi Zero W board to reboot several seconds after the watchdog stops being fed. To do this:

  • Execute the feed_watchdog.sh shell script by using the sudo ./feed_watchdog.sh command.
  • Wait for a few seconds, then stop the feed_watchdog.sh shell script execution using the Ctrl + C keyboard shortcut. This will stop the watchdog feeding operations.
  • After several more seconds, the watchdog will time out, and the Raspberry Pi Zero W board will reboot.
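
Over SSH, a reboot shows up only as a dropped session, so it can help to confirm it after reconnecting. One way, sketched below, is to compare the kernel’s boot ID (/proc/sys/kernel/random/boot_id, which changes on every boot) before and after the test. The function name boot_changed and the state-file argument are hypothetical, and a changed boot ID shows only that a reboot happened, not what caused it.

```shell
#!/bin/sh
# boot_changed STATE_FILE [BOOT_ID_FILE]
# Compares the current boot ID with the one saved in STATE_FILE by the previous
# run, then saves the current one. Prints "first run", "same boot", or "rebooted".
boot_changed() {
    state_file=$1
    boot_id_file=${2:-/proc/sys/kernel/random/boot_id}
    current=$(cat "$boot_id_file")
    if [ -f "$state_file" ]; then
        previous=$(cat "$state_file")
    else
        previous=""
    fi
    echo "$current" > "$state_file"
    if [ -z "$previous" ]; then
        echo "first run"
    elif [ "$previous" = "$current" ]; then
        echo "same boot"
    else
        echo "rebooted"
    fi
}

# Example (commented out so that sourcing this file writes nothing):
#   boot_changed "$HOME/last_boot_id"   # run once before the test, once after reconnecting
```

The state file should live on persistent storage such as the home folder, because a file kept in a RAM-backed directory would be lost in the reboot.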

SOME SUGGESTIONS

I suggest that you use a dedicated process (started right after boot) to feed the watchdog. If the Linux distro you’re using contains systemd, you can create a service that runs a process (a shell script, for example) right after the board boots and feeds the watchdog periodically. Then, you don’t need to change the source code of any of your applications to feed the watchdog. Just be aware that the watchdog will then only cause the board to reboot if a critical failure happens to systemd (or to the whole embedded Linux system), or if the service stops for some reason. If you want to reboot the board in case of a critical failure in one of your applications, don’t use this approach.
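
As a minimal sketch of such a service, the function below writes a unit file that runs a feeder script at boot. The unit name feed-watchdog.service, the script path /home/pi/feed_watchdog.sh, and the function name write_feeder_unit are assumptions made for illustration; adjust them to your system.

```shell
#!/bin/sh
# write_feeder_unit UNIT_PATH SCRIPT_PATH
# Writes a minimal systemd service unit that runs SCRIPT_PATH with bash.
write_feeder_unit() {
    unit_path=$1
    script_path=$2
    cat > "$unit_path" <<EOF
[Unit]
Description=Feed the hardware watchdog (example)

[Service]
Type=simple
ExecStart=/bin/bash $script_path

[Install]
WantedBy=multi-user.target
EOF
}

# Example (run as root; commented out so that sourcing this file changes nothing):
#   write_feeder_unit /etc/systemd/system/feed-watchdog.service /home/pi/feed_watchdog.sh
#   systemctl daemon-reload
#   systemctl enable --now feed-watchdog.service
```

The unit deliberately has no Restart= setting: if the feeder stops, it stays stopped, so the watchdog times out and the board reboots.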

For critical solutions that use embedded Linux as their OS, it’s good practice to set CONFIG_WATCHDOG_NOWAYOUT to Y. This ensures the watchdog keeps working even if the process responsible for feeding it crashes or is accidentally closed.

When both apply (that is, a critical solution runs on an embedded Linux distro containing systemd), it’s also worth knowing the RuntimeWatchdogSec setting. Note that it belongs in systemd’s system.conf file, not in a service file. When it’s set to a nonzero value, systemd itself opens the hardware watchdog, programs it with that timeout, and feeds it at least once in every half of the interval. The board then reboots automatically if systemd (or the kernel beneath it) hangs, including when embedded Linux stalls during initialization after systemd has started.

However, be careful when using this. Because the watchdog device is normally held by a single process at a time, systemd and a separate feeder process can’t both own /dev/watchdog, so choose one of the two approaches. Also, if RuntimeWatchdogSec is set too low (shorter than the longest stall systemd experiences while the system boots under load, for example), the board can be trapped in a reboot loop and never finish booting. So if you decide to use RuntimeWatchdogSec, observe how your embedded Linux behaves during boot and choose a value with a comfortable margin, since this varies from solution to solution.
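
The systemd documentation lists RuntimeWatchdogSec as a [Manager] option of system.conf, which can also be set from a drop-in file under /etc/systemd/system.conf.d/. The sketch below writes such a drop-in. The file name 10-watchdog.conf, the value 10s, and the function name write_runtime_watchdog_dropin are illustrative assumptions, and the value must not exceed what the watchdog hardware supports.

```shell
#!/bin/sh
# write_runtime_watchdog_dropin DROPIN_PATH TIMEOUT
# Writes a system.conf drop-in that sets RuntimeWatchdogSec to TIMEOUT.
write_runtime_watchdog_dropin() {
    dropin_path=$1
    timeout=$2
    mkdir -p "$(dirname "$dropin_path")"
    printf '[Manager]\nRuntimeWatchdogSec=%s\n' "$timeout" > "$dropin_path"
}

# Example (run as root, then reboot; commented out so that sourcing changes nothing):
#   write_runtime_watchdog_dropin /etc/systemd/system.conf.d/10-watchdog.conf 10s
```

Because systemd then holds the watchdog device itself, a separate feeder script should not be run alongside this setting.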

CONCLUSION

I hope my breakdown of how to use a watchdog in embedded Linux helps you design more robust solutions. With a watchdog, your embedded Linux will be able to recover from crashes and freezes. This capability is a requirement for critical solutions, which makes a watchdog a must-have feature to prevent long downtime and the need for human intervention (and related costs) to recover the device in the field.

REFERENCES
[1] How to flash a Raspberry Pi OS image to a micro-SD card: https://www.raspberrypi.com/software/
[2] How to access Raspberry Pi via SSH: https://www.makeuseof.com/how-to-ssh-into-raspberry-pi-remote/
[3] Guide to correctly compiling and uploading kernel to Raspberry Pi: https://www.raspberrypi.com/documentation/computers/linux_kernel.html

SOURCES
Website: https://manpages.debian.org/jessie/systemd/systemd-system.conf.5.en.html#:~:text=ShutdownWatchdogSec%3D%20may%20be%20used%20to,clean%20reboot%20attempt%20times%20out
Website: https://linuxhint.com/linux-kernel-watchdog-explained/
Website: https://0pointer.de/blog/projects/watchdog.html
Website: https://cateee.net/lkddb/web-lkddb/WATCHDOG.html
Website: https://www.kernelconfig.io/config_watchdog
Website: https://www.kernel.org/doc/html/latest/watchdog/watchdog-parameters.html
Website: https://how-to.fandom.com/wiki/How_to_configure_the_Linux_kernel/drivers/char/watchdog
Website: https://www.linkedin.com/posts/cleitonbueno_linux-embarcado-mais-resiliente-activity-6892121346998882304-FVXA

RESOURCES
Embarcados | www.embarcados.com.br

PUBLISHED IN CIRCUIT CELLAR MAGAZINE • AUGUST 2023 #397

Copyright © KCK Media Corp.
All Rights Reserved
