Present-day equipment relies on increasingly complex software, creating ever-greater demand for software quality and security. The two attributes, while similar in their effects, are different. A quality software is not necessarily secure, while a secure software is not necessarily of good quality. Safe software is both of high quality and security. That means the software does what it is supposed to do: it prevents hackers and other external causes from modifying it, and should it fail, it does so in a safe, predictable way. Software verification and validation (V&V) reduces issues attributable to defects, that is to poor quality, but does not currently address misbehavior caused by external effects.
Poor software quality can result in huge material losses, even life. Consider some notorious examples of the past. An F-22 Raptor flight control error caused the $150 million aircraft to be destroyed. An RAF Chinook engine controller fault caused the helicopter crash with 29 fatalities. A Therac radiotherapy machine gave patients massive radiation overdoses causing death of two people. A General Electric power grid monitoring system’s failure resulted in a 48-hour blackout across eight US states and one Canadian province. Toyota’s electronic throttle controller was said to be responsible for the lives of 89 people.
Clearly, software quality is paramount, yet too often it takes the back seat to the time to market and the development cost. One essential attribute of quality software is its traceability. This means that every requirement can be traced via documentation from the specification down to the particular line of code—and, vice versa, every line of code can be traced up to the specification. The documentation (not including testing and integration) process is illustrated in Figure 1.
The terminology is that of the DO-178 standard, which is mandatory for aerospace and military software. (Similarly, hardware development is guided by DO-254.) Other software standards may use a different terminology, but the intentions are the same. DO-178 guides its document-driven process, for which many tools are available to the designer. Once the hardware-software partitioning has been established, software requirements define the software architecture and the derived requirements. Derived requirements are those that the customer doesn’t include in the specification and might not even be aware of them. For instance, turning on an indicator light may take one sentence in the specification, but the decomposition of this simple task might lead to many derived requirements.
While requirements are being developed, test cases must be defined for each and every one of those requirements. Additionally, to increase the system safety, a so-called Safety-Instrumented Functions (SIF) should be considered. SIFs are monitors which cause the system to safely shut down if its performance fails to meet the previously defined safety limits. This is typically accomplished by redundancy in hardware, software or both. If you neglect to address such issues at an early development stage, you might end up with an unsafe system and having to redo a lot of work later.
Quality design is also a bureaucratic chore. Version control and configuration index must be maintained. The configuration index comprises the list of modules and their versions to be compiled for specific versions of the product under development. Without it, configuration can be lost and a great deal of development effort with it.
Configuration control and traceability are not just the best engineering practices. They should be mandated whenever software is being developed. Some developers believe that software qualification to a specific standard is required by the aerospace and military industries only. Worse, some commercial software developers still subscribe to the so-called iron triangle: “Get to market fast with all the features planned and high level of quality. But pick only two.”
Engineers in safety-critical industries (such as medical, nuclear, automotive, and manufacturing) work with methods similar to DO-178 to ensure their software performs as expected. Large original equipment manufacturers (OEMs) now demand adherence to software standards: IEC61508 for industrial controls, IEC62034 for medical equipment, ISO 26262 for automotive, and so forth. The reason is simple. Unqualified software can lead to costly product returns and expensive lawsuits.
Software qualification is highly labor intensive and very demanding in terms of resources, time, and money. Luckily, its cost has been coming down thanks to a plethora of automated tools now being offered. Those tools are not inexpensive, but they do pay for themselves quickly. Considering the risk of lawsuits, recalls, brand damage, and other associated costs of software failure, no company can really afford not to go through a qualification process.
As with hardware, quality must be built into the software, and this means following strict process rules. You can’t expect to test quality into the product at the end. Some companies have tried and the results have been the infamous failures noted above.
Testing embedded controllers often presents a challenge because you need the final hardware when it is not yet finished. Nevertheless, if you give testing due consideration as you prepare the software requirements, much can be accomplished by working in virtual or simulated environments. LDRA (www.ldra.com) is one great tool for this task.
Numerous methods exist for software testing. For example, dynamic code analysis examines the program during its execution, while the static analysis looks for vulnerabilities as well as programming errors. It has been shown mathematically that 100% test coverage is impossible to achieve. But even if it was, 35% to 40% of defects result from missing logic paths and another 40% from the execution of unique combinations of logic paths. Such defects wouldn’t get caught by testing, but can be mitigated by SIF.
Much embedded code is still developed in-house (see Figure 2). Is it possible for companies to improve programmers’ efficiency in this most labor-intensive task? Once again, the answer lies in automation. Nowadays, many tools come as complete suites providing various analyses, code coverage, coding standards compliance, requirements traceability, code visualization, and so forth. These tools are regularly seen at developers of avionic and military software, but they are not as frequently used by commercial developers because of their perceived high cost and steep learning curve.
With the growth of cloud computing and the Internet of Things (IoT), software security is gaining on an unprecedented importance. Some security measures can be incorporated in hardware while others are in software. Data encryption and password protection are the vital parts. Unfortunately, security continues to be not treated by some developers as seriously as it should be. Security experts warn that numerous IoT developers have failed to learn the lessons of the past and a “big IoT hack” in the near future is inevitable.
On a regular basis, the media report on security breaches (e.g., governmental organization hacks, bank hacks, and automobile hacks). What can be done to improve security?
There are several techniques—such as Common Weakness Enumeration (CWE)—that can help to improve our chances. However, securing software is likely a task a lot more daunting than achieving comprehensive V&V test coverage. One successful hack proves the security is weak. But how many unsuccessful hacks by test engineers are needed to establish that security is adequate? Eventually, a manager, probably relying on some statistics, will have to decide that enough effort has been spent and the software can be released. Different types of systems require different levels of security, but how is this to be determined? And what about the human factor? Not every test engineer has the necessary talent for code breaking.
History teaches us that no matter how good a lock, a cipher, or a password someone has eventually broken it. Several security developers in the past challenged the public to break their “unbreakable” code for a reward, only to see their code broken within hours. How responsible is it to keep sensitive data and systems access available in the cyberspace just because it may be convenient, inexpensive, or fashionable? Have the probability and the consequences of a potential breach been always duly considered?
I have used cloud-based tools, such as the excellent mbed, but would not dream of using them for a sensitive design. I don’t store data in the cloud, nor would I consider IoT for any system whose security was vital. I don’t believe cyberspace can provide sufficient security for many systems at this time. Ultimately, the responsibility for security is ours. We must judge whether the use IoT or the cloud for a given product would be responsible. At present, I see little evidence to be convinced the industry is adequately serious about security. It will surely improve with time, but until it does I am not about to take unnecessary risks.
George Novacek is a professional engineer with a degree in Cybernetics and Closed-Loop Control. Now retired, he was most recently president of a multinational manufacturer for embedded control systems for aerospace applications. George wrote 26 feature articles for Circuit Cellar between 1999 and 2004. Contact him at email@example.com with “Circuit Cellar”in the subject line.