Use Watchdog Timers (EE Tip #143)

Watchdog timers are essential to many complete electronic system designs.  As Bob Japenga explains, following a few guidelines will help make your designs more effective.

No longer used just in the realm of fault-tolerant systems, independent watchdog timers are added to systems because we know something can go wrong and keep the system from being fully functional. Sometimes the dogs reset the processor and sometimes they just safe the device and notify a user. However they work, they are an essential part of any system design. Here are the main guidelines we use:

  • Make it independent of the processor. The last thing you want is for the processor to lock up and the watchdog to lock up too.
  • Do not tickle the watchdog during an interrupt. (The interrupt can keep going while a critical thread is locked up.)
  • Do not tickle the watchdog until you are sure that all of your threads are running and stable. (One way to structure this is sketched after this list.)
  • Provide a way for your debugger to have breakpoints without tripping the watchdog.
  • If the watchdog can be disabled, provide a mechanism for this to be detected.
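
To make the second and third guidelines concrete, here is a minimal sketch (not taken from Japenga's white paper) of a supervised kick: each critical thread checks in from its own main loop, and a supervisor task kicks the hardware watchdog only after startup is complete and every thread has checked in. The function kick_hw_watchdog() and the thread IDs are placeholders for whatever your hardware and RTOS actually provide.

```c
/*
 * Sketch of a supervised watchdog kick for a multi-threaded system.
 * kick_hw_watchdog() and the thread numbering are placeholders for
 * whatever your hardware/BSP actually provides.
 */
#include <stdbool.h>

#define NUM_THREADS 3

static volatile bool thread_alive[NUM_THREADS];   /* set by each critical thread        */
static volatile bool startup_complete = false;    /* set once all threads are running    */

extern void kick_hw_watchdog(void);               /* platform-specific, assumed to exist */

/* Call once, after all critical threads have been created and are stable. */
void watchdog_startup_complete(void)
{
    startup_complete = true;
}

/* Each critical thread calls this from its main loop (never from an ISR). */
void watchdog_checkin(unsigned thread_id)
{
    if (thread_id < NUM_THREADS) {
        thread_alive[thread_id] = true;
    }
}

/* Called periodically from a low-priority supervisor task. */
void watchdog_supervisor(void)
{
    if (!startup_complete) {
        return;                      /* don't kick until every thread is up */
    }

    for (unsigned i = 0; i < NUM_THREADS; i++) {
        if (!thread_alive[i]) {
            return;                  /* someone missed a check-in: let the dog bite */
        }
    }

    kick_hw_watchdog();              /* all threads healthy: kick the hardware dog */

    for (unsigned i = 0; i < NUM_THREADS; i++) {
        thread_alive[i] = false;     /* require fresh check-ins next period */
    }
}
```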

I provide many more guidelines for watchdog design in a white paper that’s posted on our website.—Bob Japenga, CC25, 2013

Troubleshoot Electronics Problems with Logging (EE Tip #141)

Electrical engineers often develop “headless” electronic systems—that is, systems without user interfaces. And many of those systems are embedded within products and are generally out of reach when problems occur. Bob Japenga is an engineer with some advice about logging and how it can help you troubleshoot problems as they occur.

Many of our designs are buried in some product or located in remote areas. No one is there when the device hiccoughs. Well-defined logging can help make your system more robust because there are going to be problems, and you might as well find out as much as you can about the problems when they happen. Here are some general guidelines that we apply here:

• Use an existing logging facility if you can. It should have all of the features discussed here.

• Unless you truly have unlimited disk space, provide a self-pruning cap on all logs. Linux’s syslog facility has this built in.

• Attempt to provide the most information in the least amount of space. One way we do this is by limiting the number of times the same error can be logged. I cannot tell you how many times I have looked at a log file and found the same error logged over and over again. Knowing your memory limitations and the frequency of the error, after a set number of identical logs, start logging only one in every 100, or only allow so many per hour, and include the total count. (A sketch of this approach follows the list.) Some failures are best kept in error counters. For example, communications errors in a noisy environment should be periodically logged with a counter; you don’t usually need to know every occurrence.

• Create multiple logs covering multiple areas. For example, network errors and communications errors are better kept in their own log, apart from processing errors. This will keep a single error from flooding all of the logs with its own problem.

• Timestamp all logs—ideally with date and time—but I understand that not all of our systems have date and time. As a minimum, the timestamp could be milliseconds since power-up.

• Establish levels of logging. Some logging is only applicable during debugging. Build that into your logging.

• Avoid side effects. I hate it when the designer tells me that if he turns logging on, the system will come to its knees. That’s just a bad logging design.

• Make the logs easy to automatically parse.—Bob Japenga, CC25, 2013
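
As a rough illustration of the “same error over and over” guideline above, the sketch below logs the first few occurrences verbatim and then only every 100th, carrying a running total. The log_printf() call and the thresholds are assumptions for the sake of the example, not something from Japenga’s article.

```c
/*
 * Sketch of the "don't flood the log with the same error" guideline.
 * log_printf() stands in for whatever logging call your system provides;
 * the thresholds are illustrative, not from the original tip.
 */
#include <stdint.h>

#define FULL_LOG_LIMIT 10u    /* log the first 10 occurrences verbatim */
#define SAMPLE_EVERY   100u   /* after that, log only 1 in every 100   */

extern void log_printf(const char *fmt, ...);   /* assumed to exist */

void log_comm_error(const char *detail)
{
    static uint32_t count = 0;

    count++;

    if (count <= FULL_LOG_LIMIT) {
        log_printf("comm error: %s (count=%lu)\n", detail, (unsigned long)count);
    } else if ((count % SAMPLE_EVERY) == 0) {
        /* Keep the running total so nothing is lost, without filling the log. */
        log_printf("comm error still occurring: %s (total=%lu)\n",
                   detail, (unsigned long)count);
    }
}
```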

Embedded Security (EE Tip #139)

Embedded security is one of the most important topics in our industry. You could build an amazing microcontroller-based design, but if it is vulnerable to attack, it could become useless or even a liability.

Virginia Tech professor Patrick Schaumont explains, “perfect embedded security cannot exist. Attackers have a wide variety of techniques at their disposal, ranging from analysis to reverse engineering. When attackers get their hands on your embedded system, it is only a matter of time and sufficient eyeballs before someone finds a flaw and exploits it.”

So, what can you do? In CC25, Patrick Schaumont provided some tips:

As design engineers, we should understand what can and what cannot be done. If we understand the risks, we can create designs that give the best possible protection at a given level of complexity. Think about the following four observations before you start designing an embedded security implementation.

First, you have to understand the threats that you are facing. If you don’t have a threat model, it makes no sense to design a protection—there’s no threat! A threat model for an embedded system will specify what an attacker can and cannot do. Can she probe components? Control the power supply? Control the inputs of the design? The more precisely you specify the threats, the more robust your defenses will be. Realize that perfect security does not exist, so it doesn’t make sense to try to achieve it. Instead, focus on the threats you are willing to deal with.

Second, make a distinction between what you trust and what you cannot trust. In terms of building protections, you only need to worry about what you don’t trust. The boundary between what you trust and what you don’t trust is suitably called the trust boundary. While trust boundaries were originally logical boundaries in software systems, they also have a physical meaning in embedded context. For example, let’s say that you define the trust boundary to be at the chip package level of a microcontroller.

This implies that you’re assuming an attacker will get as close to the chip as the package pins, but not closer. With such a trust boundary, your defenses should focus on off-chip communication. If there’s nothing or no one to trust, then you’re in trouble. It’s not possible to build a secure solution without trust.
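
As a hedged illustration of that example (not something from Schaumont’s text), a trust boundary at the chip package means anything arriving over an off-chip bus gets verified before it is used. In the sketch below, compute_mac() stands in for whatever MAC primitive your platform or a vetted library provides; the point is the structure, not the primitive.

```c
/*
 * Minimal sketch, assuming the trust boundary sits at the chip package:
 * anything arriving over the off-chip bus is verified before use.
 * compute_mac() is a placeholder for a real MAC (e.g., a hardware crypto
 * block or a vetted library); do not roll your own primitive.
 */
#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>

#define MAC_LEN 16u

/* Assumed to fill tag[MAC_LEN] using a device key held on-chip. */
extern void compute_mac(const uint8_t *msg, size_t len, uint8_t tag[MAC_LEN]);

/* Constant-time comparison so an attacker can't learn the tag byte by byte. */
static bool mac_equal(const uint8_t *a, const uint8_t *b)
{
    uint8_t diff = 0;
    for (size_t i = 0; i < MAC_LEN; i++) {
        diff |= (uint8_t)(a[i] ^ b[i]);
    }
    return diff == 0;
}

/* Accept an off-chip message only if its MAC verifies. */
bool accept_offchip_message(const uint8_t *msg, size_t len,
                            const uint8_t received_tag[MAC_LEN])
{
    uint8_t expected[MAC_LEN];

    compute_mac(msg, len, expected);
    return mac_equal(expected, received_tag);
}
```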

Third, security has a cost. You cannot get it for free. Security has a cost in resources and energy. In a resource-limited embedded system, this means that security will always be in competition with other system features in terms of resources. And because security is typically designed to prevent bad things from happening rather than to enable good things, it may be a difficult trade-off. In feature-rich consumer devices, security may not be a feature for which a customer is willing to pay extra.

The fourth observation, and maybe the most important one, is to realize that you’re not alone. There are many things to learn from conferences, books, and magazines. Don’t invent your own security. Adapt standards and proven techniques. Learn about the experiences of other designers. The following examples are good starting points for learning about current concerns and issues in embedded security.

Security is a complex field with many different dimensions. I find it very helpful to have several reference works close by to help me navigate the steps of building any type of security service.

Schaumont suggested the following useful resources:

Don’t Trust Connectors, Solder, or Wires (EE Tip #138)

Engineer Robert Lacoste is one of our go-to resources for engineering tips and tricks. When we asked him for a few bits of general engineering advice, he responded with a list of more than 20 invaluable electrical engineering-related insights. One of our team’s favorite “Lacoste tips” is this: don’t trust connectors, solder, or wires. Read on to learn more.

One of my colleagues used to say that 90% of design problems are linked either to power supplies or to connector-related issues. It’s often the case. Never trust a wire or a connector. If you don’t understand what’s going on, use your ohmmeter to check if the connections are as planned. (Do this even if you are sure they are.) A connector might have a broken pin, a wire might have an internal cut, a solder joint might be dry and not conductive, or you might simply have a faulty wiring scheme. (See the nearby photo.)

Using the wrong pinout for a connector is a common error, especially on RS-232 ports where it’s approximately 50% probable that you’ll have the wrong RX/TX mapping. Swapping the rows of a connector (as you see here) is also quite common.

Another common error is to spend time on a nonworking prototype only to discover after a few hours that the prototype was working like a charm but the test cable was faulty. This should not be a surprise: test cables are used and stressed daily, so they’re bound to be damaged over time. This can be even more problematic with RF cables, which might seem perfect when checked with an ohmmeter but have degraded RF performance. As a general rule, if you find that a test cable shows signs of fatigue (e.g., it exhibits intermittent problems), just toss it out and buy a new one!—Robert Lacoste, CC25, 2013

 

Test Under Real Conditions (EE Tip #137)

The world’s best engineers have one thing in common: they’re always learning from their mistakes. We asked Niagara College professor and long-time contributor Mark Csele about his biggest engineering-related mistake. He responded with the following interesting insight about testing under real conditions.

Mark Csele’s complete portable accelerometer design, which he presented in Circuit Cellar 266, shown with the serial download adapter. The adapter is installed only when downloading data to a PC and mates with an eight-pin connector on the PCB. The rear of the unit features three powerful rare-earth magnets that enable it to be attached to a vehicle.

Trusting simulation (or, if you prefer, a lack of testing under real conditions). I wrote the firmware for a large three-phase synchronous control system. The code performed amazingly well in the lab, and no matter what stimulus was applied, it always produced correct results. When put into operation in the field (at a very large industrial installation), it failed every 20 minutes or so, producing a massive (and dangerous) step-voltage output! I received a call from a panicked engineer on-site, and after an hour of diagnosis, I asked for a screenshot of the actual power line (we had been told ahead of time that it was “noisy”) only to be shocked at how noisy it actually was. Massive glitches appeared on the line, many times larger than the AC peak and crossing zero several times, causing no end of problems. Many hours later (in the middle of the morning), the software was fixed with a new algorithm that compensated for such “issues.” This was an incredibly humbling experience: I wasn’t nearly as smart as I had thought, and I really missed the boat on testing. I tested the system under what I thought were realistic conditions, whereas I really should have spent time investigating what the target grid really looked like.—Mark Csele, CC25 (anniversary issue)
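
Csele doesn’t describe the replacement algorithm, but one common way to reject glitchy zero crossings on a noisy AC line is hysteresis plus a lockout window. The sketch below is purely illustrative; the thresholds and lockout period are made-up numbers, not values from the article.

```c
/*
 * Illustrative sketch only: one common way to reject spurious zero crossings
 * on a noisy AC line. Thresholds and the lockout period are made up; the
 * article does not describe Csele's actual algorithm.
 */
#include <stdint.h>
#include <stdbool.h>

#define POS_THRESHOLD   500    /* ADC counts above mid-scale: accept "high" */
#define NEG_THRESHOLD  -500    /* ADC counts below mid-scale: accept "low"  */
#define LOCKOUT_SAMPLES  50    /* ignore further crossings right after one  */

/* Returns true once per genuine negative-to-positive crossing. */
bool zero_cross_detect(int32_t sample)
{
    static bool     was_negative = false;
    static uint32_t lockout      = 0;
    bool            crossing     = false;

    if (lockout > 0) {
        lockout--;                       /* still inside the lockout window      */
    } else if (sample <= NEG_THRESHOLD) {
        was_negative = true;             /* firmly below the hysteresis band     */
    } else if (was_negative && sample >= POS_THRESHOLD) {
        crossing     = true;             /* clean low-to-high transition         */
        was_negative = false;
        lockout      = LOCKOUT_SAMPLES;  /* glitches can't retrigger for a while */
    }

    return crossing;
}
```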