Testing doesn’t have to be torture! In this article, I show how to select the right tool for this essential job. The right testing tools lead to improved customer satisfaction, and a real payoff on the bottom line.
Introduction: The Search for an Honest Man
Test and measurement tools are the lifeblood of any well-stocked engineering laboratory. As one wise friend of mine told me a long time ago, “Tools don’t cost money. Tools make money.”
The gist of this Zen-koan-like contradiction is that tools, properly applied, save time. And time is not only money—it’s also not having to explain to your boss why the project Gantt chart looks more like an asymptote stretching towards an uncertain infinity rather than a convergence toward marketing’s drop-dead release date.
In any case, the broad category of test and measurement covers everything from the ubiquitous calibrated hammer which resides in every engineer’s top left-hand drawer, to the more rarified high-speed digital storage oscilloscopes (DSOs), fancy in-circuit emulators, sophisticated SPICE simulators, and workhorse integrated development environments (IDEs).
Often overlooked, however, are instruments designed not to develop systems and their components, but to test and prove out the functionality of the completed system as a whole. Broadly known as automated test tools, these can range from simple batch files that exercise a specific functionality of a system—a sort of “go/no go” type test—to tools which are compiled into the system code and can inspect machine states and variables in real time.
The applicability of these tools varies (the notable exception being the calibrated hammer), and they are not one-size-fits-all. In this article, I will detail the selection process that my current company went through to choose an automated test platform. Along the way, we’ll discuss the general use cases for automated test tools, and some of the dos and don’ts of automating your embedded test process.
First, I’ve been remiss in introducing myself and my relationship to this topic. I am an EE/CS with far too many (almost forty) years of experience in embedded systems development—primarily datacom, telecom, and related industries. I’m currently employed as a consultant working at Rudolph Research Analytical, a manufacturer of precision laboratory equipment which, coincidentally, falls under the category of test and measurement for an unrelated industry. In this case the instruments developed by RRA are designed to serve the test and measurement needs of various chemical and materials laboratories.
Rudolph has been operating in this sector for over fifty years, with three main product lines—polarimeters to measure the polarization of light, density meters to measure, ah, density, and refractometers to measure the refractive index of liquids, compacted powders, and clear solids. Rudolph’s instruments are primarily laboratory bench-top type units, and are sold to the pharmaceutical industry, the flavor and fragrance industry, the industrial chemical and food safety industries, and the alcohol and sugar processing industries.
I work in the software test group for Rudolph, and suffice it to say, if our instruments do not provide a precise, accurate and repeatable measurement of the physical properties of the materials they are testing, we hear about it pretty quickly. Rudolph has a global presence in these various industries, and our equipment is used by the Pfizers and the Dows and the Coca-Colas of the world.
Our instruments are all based on embedded firmware that talks to a custom-built version of Windows Embedded 7. Soon, due to the EOL notice from Microsoft for support of Windows Embedded 7, we will be migrating all our UI platforms over to Windows 10/11 IoT.
Our devices provide a touchscreen user interface to allow operators to perform measurements, check the calibration of our instruments, and display and review data and results on the embedded platform itself. We also provide an industry-specific mode of operation that is compliant with the data handling and integrity requirements of FDA 21 CFR Part 11, regarding electronic records handling, as well as general Good Laboratory Practices (GLP) under FDA 21 CFR Part 58.
These standards require our instruments to provide certain metrics and modes of operation that allow a reasonable user to comply and pass an FDA inspection of their facilities. The penalties to the end user can be significant if they are found to be non-compliant, and all instrumentation used in end-user labs must be able to show the ability to conform to the standards and give confidence that measurement data is being secured for as long as it resides within the control of the measuring device. That said, the charter of our internal software quality process is to ensure that each new release of the UI is fully tested and passes prior to it getting to the production floor or the field.
When I was hired by RRA in the early part of 2017, I was tasked with implementing a test regimen for all of Rudolph’s products. The goals were to have a test and release process, which at the time was lacking, and to institute both manual and automated testing as well as regression testing. As a side product, the engineering department was also looking to be able to automate stress testing of the design, and lastly there was a desire to gather performance metrics on the UI.
The manual testing portion of my mission was accomplished through the development of a formal test process. We created an Entry Gate checklist to place a release into test, and a similar checklist to exit. Finally, a revision-controlled release was done via a test review meeting with all stakeholders. This same process was then to be augmented with the automated test. The challenge fell to me and my team to select the tool that we would use to perform the automated test portion of our testing.
Evaluation and Selection
Our initial search turned up many candidates for automated test tool suites. After a winnowing process we decided to get evaluation copies of three:
- TestComplete from SmartBear Software
- Squish from FrogLogic GmbH
- EggPlant from TestPlant Ltd, now offered by Keysight Inc.
As you can see, automated testing software comes complete with funny names and cute logos, but we were more concerned with the following:
- Footprint—For most embedded systems, the performance of the system and the space and resources required for the test utility were critical. We needed a test environment that was light and resource efficient.
- Ease of use—The rapid development of effective tests that integrate well with our embedded environment was a crucial requirement. We also wanted a system that was easy to set up and maintain.
- Multi-faceted—We wanted a single tool suite that would support our need for automated testing, regression testing, and stress testing, without a lot of configuration changes. Of these, the automated testing was the most critical to speed up our manual test and perform tedious repetitive testing in a way that would offload human testers.
- Support—With limited resources and tight time constraints, we wanted our partner to provide proactive and knowledgeable support, and we wanted to be able to reach the right expert with as few gatekeeper steps as possible.
- Flexibility—We wanted a “Swiss Army knife” type tool suite that allowed us to interface with a variety of platforms. Our UI can run on either the embedded device or a standard PC, and we wanted to ensure that we would not be stuck with a tool that would not be as flexible as the application with which it was interfacing.
- Scalable—Our initial requirement was relatively small. We wanted to be able to run one system fully under test (multiple embedded systems simultaneously) and have another free for script development. But we wanted a partner that could scale and grow with us as our testing needs might change.
- Cost—We wanted a low-cost system, but not one that was too inflexible or with a steep upgrade cost. We decided to look at TCO from the point of view of value provided as a function of efficiency, rather than solely initial cost plus maintenance/licensing. Our management was willing to invest in a tool suite that would grow with us and give us the ability to leverage our testing headcount. We aimed to do “more with less,” to quote the oft-heard phrase.
Ideally, we wanted a system that would be able to kick off a build, load the application onto a unit under test (UUT), and then call the automated test to run a suite of tests and produce a report which is emailed to developers and stakeholders on a nightly basis.
As we will see, none of the solutions met this lofty goal. But in the end, we selected the partner that met most of our needs.
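The nightly flow described above (build, load, test, report) can be sketched as a simple ordered pipeline. This is a minimal Python illustration, not our actual tooling: the step functions `build_firmware`, `load_onto_uut`, and `run_test_suite` are hypothetical stand-ins, and the emailing step is left out (the report string could be handed to any mailer).

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class StepResult:
    name: str
    ok: bool
    detail: str = ""


def run_nightly(steps: List[Tuple[str, Callable[[], object]]]) -> List[StepResult]:
    """Run steps in order and stop at the first failure,
    since each later step depends on the one before it."""
    results = []
    for name, action in steps:
        try:
            detail = action()
            results.append(StepResult(name, True, str(detail or "")))
        except Exception as exc:
            results.append(StepResult(name, False, str(exc)))
            break
    return results


def format_report(results: List[StepResult]) -> str:
    """Render one PASS/FAIL line per executed step."""
    lines = []
    for r in results:
        status = "PASS" if r.ok else "FAIL"
        lines.append(f"{status}  {r.name}  {r.detail}".rstrip())
    return "\n".join(lines)


# Hypothetical stand-ins for the real build, load, and test actions.
def build_firmware() -> str:
    return "build ok"


def load_onto_uut() -> str:
    return "loaded"


def run_test_suite() -> str:
    return "suite finished"


if __name__ == "__main__":
    steps = [
        ("build", build_firmware),
        ("load", load_onto_uut),
        ("test", run_test_suite),
    ]
    print(format_report(run_nightly(steps)))
```

Stopping at the first failed step keeps the report honest: a failed build never shows up as a string of misleading test failures.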
SmartBear—TestComplete: We received an evaluation license for TestComplete first. They were responsive to our needs and assigned a team member to be our direct contact to their support for all questions.
One of our concerns was that TestComplete required a compiled-in component that we would have to add to our build environment. To our thinking, this was a no-no for most embedded systems, as the UUT build would no longer be the same as the release build.
The second flag was that TestExecute, the local interface with the testing suite, also had to be installed on the UUT. This additional utility was function-heavy, bringing the ability to craft scripts and tests using a version of Python with extensions that were customized for TestComplete. However, this again required hooks in our code, and the need to develop Python scripts on the UUT or a similar environment.
To be fair, TestComplete also had the ability to develop and run these scripts on a remote PC. However, the normal test environment seemed to be relying on the UUT for many resources. Our systems are “built-for-purpose” and do not have a lot of extra horsepower, so this was a significant drawback.
The last issue we ran into was in installing TestExecute on our target UUT (Figure 1). We found that TestExecute was relying on components of the Windows OS that were not part of our custom Windows Embedded 7 build (Figure 2). Adding these components back made the system even further from our production configuration.
FrogLogic—Squish: FrogLogic is a German company, and they too were very helpful in getting us an evaluation license—although they only provided it for a ten-day evaluation window. By contrast, both SmartBear and TestPlant gave us thirty days and extended this to forty-five days on request.
With our limited testing time, we found that Squish was similar in many ways to TestComplete. It required a compiled-in component, and used an object-based testing model, which allowed for internal inspection of the code states and variables.
But again, the main drawback we had with Squish was the requirement for a compiled-in component, which would make our test UI a different build from our release UI. We also had some similar issues installing the local UUT components of Squish on to our Windows Embedded 7. This was again because we had stripped out many Windows components in order to save space.
TestPlant (Keysight)—Eggplant: The first thing we saw when evaluating Eggplant was that the test philosophy was image- and event-based. The difference here was that Eggplant did not require any compiled-in component—it would run with our current UI without any modifications. It was able to do this because Eggplant relied on screen-scraping technology, image recognition, optical character recognition (OCR), and keyboard/mouse events to drive and verify the UUT operation.
At first our team thought that this approach might be slow, and that the ability of the system to properly detect events and perform test cases would be limited by the precision with which it could detect software states from the observed changes on the screen. In addition, early on we realized that developing scripts would require us to capture images from the UI screen itself and provide these images to the TestPlant IDE as a way of parsing and matching to select various screen events. We also worried that the limitation of OCR and keyboard/mouse events might not give us the flexibility to test all the features of our systems.
We started evaluating the Eggplant environment on our system, and we were pleasantly surprised. First, we were not only assigned a sales contact, but we were also given full access to the US-based (Colorado) support team for any questions and issues that might arise. We immediately started to use the TestPlant IDE and found that it was easy to install on our PC platform, and that all tests were to be developed remotely and run on our PC.
The interface to the UUT was similarly slick (Figure 3). Eggplant had very smartly leveraged a tool called Virtual Network Computing (VNC) as their interface with the UUT. For those that are not familiar, VNC is a remote desktop-type tool. It has a client-server model of operation, but any client can also be a server. In practice, VNC is installed on the UUT and on the development PC. Eggplant IDE comes with a version of VNC built into their installation (Figure 4).
The scripting language for Eggplant was another challenge. TestPlant was founded by the developers of SenseTalk, an early scripting language based on natural language processing (NLP). SenseTalk was therefore the language supported in the TestPlant IDE for the development of all the scripts for testing, as well as the reports and the UI interfaces. Our team was leery of having to learn a new language. But two of us—myself and one of my colleagues, Sridhar Rajamani—began to read through the SenseTalk documentation to evaluate how difficult it would be to learn.
Once again, we were pleasantly surprised. SenseTalk was very cleanly laid out, though the documentation was initially difficult as it was all online and set up as a sort of wiki. Once we understood how to access it from within the IDE via context-specific help, it was easy to use and learn (Figure 5).
The language itself is strongly NLP-based. It allows constructs like, “Repeat with each item currentScript of scriptList,” which will automatically parse the list and iterate through the repeat (which is their way of constructing a loop) (Figure 6). Both of us were able to quickly grasp the scripting language and found it intuitive and logical (Figure 7).
We did have to contact TestPlant support when trying to interface with our instrument, however. We had set up a simple script as a demo to try to determine the state of the UI based on the icons displayed on the main screen. We thought it looked correct, but it kept failing. Once we contacted support, they agreed with us that our script logic was correct, and at first they could not determine what was causing the failure. It took a third-level support person, Carrie Graf from their Colorado team, to show us that the screen-scraping and image recognition capabilities of their system were so precise that it found an unnoticed bug in our UI software. Icons were being rendered in slightly different ways that were almost imperceptible to the eye, and had been for years without us noticing. Eggplant’s software not only noticed, but it also properly flagged the image as not matching and failed the test case (Figure 8).
This may have been the tipping point for us with Eggplant. After a few more days of scripting and checking, we chose this tool for our automated testing needs.
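To illustrate why an image-based check can catch rendering differences that the eye misses, here is a minimal Python sketch of a pixel-by-pixel comparison with an adjustable tolerance. It is only an illustration of the general idea; we do not know how Eggplant's matching is implemented internally, and the function names and the list-of-tuples image format are assumptions made for the example.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each 0-255
Image = List[List[Pixel]]      # rows of pixels


def count_mismatches(reference: Image, captured: Image, tolerance: int = 0) -> int:
    """Count pixels where any color channel differs by more than `tolerance`."""
    same_shape = len(reference) == len(captured) and all(
        len(a) == len(b) for a, b in zip(reference, captured)
    )
    if not same_shape:
        raise ValueError("images must have the same dimensions")

    mismatches = 0
    for ref_row, cap_row in zip(reference, captured):
        for ref_px, cap_px in zip(ref_row, cap_row):
            if any(abs(r - c) > tolerance for r, c in zip(ref_px, cap_px)):
                mismatches += 1
    return mismatches


def icon_matches(reference: Image, captured: Image,
                 tolerance: int = 0, max_mismatches: int = 0) -> bool:
    """True when the captured icon is within the allowed difference."""
    return count_mismatches(reference, captured, tolerance) <= max_mismatches
```

With a tolerance of zero, a single pixel that is two shades off fails the match, which is the kind of near-invisible difference a human tester would never report. Loosening the tolerance trades that sensitivity for robustness against harmless rendering noise.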
Use Cases: SenseTalk Scripting
The SenseTalk script language was easy to learn and use. The Eggplant IDE made it simpler still: autocomplete and context-specific suggestions for the proper primitives appear as you type each line of code, as a built-in part of the IDE feature set. While we still had a bit of a learning curve, we were able to create effective scripts in short order.
Let’s now examine some of our SenseTalk scripts as use cases for automated test reports, stress testing and regression testing in our embedded environment.
Use cases—automated test reports: In Figure 9 we have an example of our automated test suite. We designed the system to be data-driven, in the sense that we can use a single Excel file to set the top-level parameters we want to test. The Excel file holds information such as the unit ID, its IP address, its mode, and passwords, as well as the version and level of test that we want to run.
Figure 10 is a typical Excel file. You can see that each column represents an addressable item, and that each row represents a specific UUT. In this way we’re able to stack up UUTs to test and run various models and settings for each one. When the top-level script is invoked, it asks for the Excel file name and then imports the information to configure the test. SenseTalk then nicely parses it and makes each field a scriptable, addressable portion of an object. In this context, the phrase “for each item of THINGY, do this THING” works perfectly well as an instruction. Simple!
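The data-driven layout described above, one row per UUT and one column per setting, can be sketched in Python as follows. This is an analogue for illustration, not our SenseTalk code: CSV text stands in for the Excel file so the example needs only the standard library, and the column names, unit IDs, addresses, and version strings are all hypothetical.

```python
import csv
import io
from dataclasses import dataclass
from typing import List


@dataclass
class UnitConfig:
    unit_id: str
    ip_address: str
    mode: str
    password: str
    version: str
    test_level: str


def load_units(text: str) -> List[UnitConfig]:
    """Turn each data row into an object whose fields are addressable by name.
    The header row must use the UnitConfig field names."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        UnitConfig(**{key: (value or "").strip() for key, value in row.items()})
        for row in reader
    ]


# Hypothetical sample: two units stacked up for one test run.
SAMPLE = """unit_id,ip_address,mode,password,version,test_level
AP6-01,192.0.2.10,standard,secret1,4.0.1,full
DM-02,192.0.2.11,cfr,secret2,4.0.2,smoke
"""

if __name__ == "__main__":
    for unit in load_units(SAMPLE):
        print(f"Testing {unit.unit_id} at {unit.ip_address} ({unit.test_level})")
```

Adding another unit to the nightly run then means adding a row to the file, with no change to the scripts themselves.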
Use cases—stress testing: Using Eggplant to stress test has been both easy and incredibly effective. In many cases embedded UIs have issues which only occur randomly, or under certain coincidental conditions, or over time. But finding that condition reported from the field and then testing to see if it’s fixed requires repeatedly executing UI operations. Plus, it’s unrealistic to expect someone to hit a button a thousand times and note the one time that a specific issue occurs. Using Eggplant and SenseTalk scripts, we can have an automatic “finger” that tirelessly presses the button, and automatic “eyes” that continually look for the one-in-a-thousand error.
This same technique is then used to prove out any fix, and it eventually may be incorporated into a regression test for the unit. In this way SenseTalk scripting has improved the overall reliability of our UI and measurement systems in a significant fashion.
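The "finger and eyes" loop can be sketched in a few lines. This Python version is illustrative only: in practice the press and the check would be image-driven actions against the UUT, whereas here `FakeUnit` is a hypothetical stand-in that misbehaves on chosen presses so the loop can be exercised without hardware.

```python
from typing import Callable, Iterable, List


def stress_test(press_button: Callable[[], None],
                error_visible: Callable[[], bool],
                iterations: int) -> List[int]:
    """Press the button `iterations` times and return the 1-based
    iteration numbers at which the error condition was observed."""
    failures = []
    for i in range(1, iterations + 1):
        press_button()          # the automatic "finger"
        if error_visible():     # the automatic "eyes"
            failures.append(i)
    return failures


class FakeUnit:
    """Hypothetical UUT stand-in that shows an error on specific presses."""

    def __init__(self, failing_presses: Iterable[int]):
        self.failing = set(failing_presses)
        self.presses = 0

    def press(self) -> None:
        self.presses += 1

    def error_visible(self) -> bool:
        return self.presses in self.failing


if __name__ == "__main__":
    unit = FakeUnit({137, 842})
    print(stress_test(unit.press, unit.error_visible, 1000))
```

Recording the iteration numbers, rather than stopping at the first failure, gives a rough failure rate that can be compared before and after a fix.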
Use cases—regression testing: Last, we set up formal regression testing, again with a data-driven approach. To accomplish this, we created a naming convention. Lower-level scripts use a file name that contains the revision of software and the specific model that is being tested. SenseTalk can easily parse this name into fields, and we will identify specific tests as eligible to run based on the matching of the script name. Older tests will also be run if the major revision number is matched and the minor number is equal to or less than the current revision.
Our script name convention is [TEST]-[MODELS]-[REVISION (Series.Major.Minor.RelType)]-[ScriptName].script (Figure 11). We parse these file names to determine if this script needs to run, based on the model and the software version data in the column for this unit. So, for a regression test that was run with the model set to AP6 (our polarimeter) and software revision 188.8.131.52, we would sweep up the test cases and run any that were in the 4.0.x.x set. For a model where we wanted to test a branch release—184.108.40.2063 for instance—we would sweep up any tests in the 220.127.116.11 or .1002 designators. This was easy with SenseTalk’s ability to parse. We could set up “boilerplate” tests, like checking the UI version, that would run in addition to a new bug or functionality test from a later revision.
Figure 12 is an example of this. The SenseTalk variables that are used are either passed in at invocation time, or they are global in scope for things like Mode or passwords that do not change per session. You will note that we have designed “helper” scripts that always do one thing. The Navigate script is a good example. It takes two arguments, the button you want to find, and a value of true/false, where true means the requested button will require a password to navigate this portion of the menu tree.
At the conclusion of each boilerplate script, the pass/fail status is logged, and the final report is updated. Ultimately, the complete report, with the status of each test run, is emailed to anyone who wishes to be notified of the results of the overnight testing (Figure 13). Neat!
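The file-name-driven selection described earlier in this section can be sketched as a parse step plus an eligibility check. This Python version is an illustration under stated assumptions, not our SenseTalk implementation: it treats "TEST" as a literal prefix, assumes multiple models are joined with "+", assumes model names contain no hyphens, and requires the series to match as well as the major number. The model "AP7" and the revision used in the example are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ScriptInfo:
    models: Tuple[str, ...]
    series: int
    major: int
    minor: int
    rel_type: str
    name: str


def parse_script_name(filename: str) -> Optional[ScriptInfo]:
    """Parse TEST-MODELS-Series.Major.Minor.RelType-ScriptName.script.
    Returns None when the name does not follow the convention."""
    suffix = ".script"
    if not filename.endswith(suffix):
        return None
    parts = filename[: -len(suffix)].split("-", 3)
    if len(parts) != 4 or parts[0] != "TEST":
        return None
    _, models, revision, name = parts
    fields = revision.split(".")
    if len(fields) != 4:
        return None
    try:
        series, major, minor = (int(f) for f in fields[:3])
    except ValueError:
        return None
    return ScriptInfo(tuple(models.split("+")), series, major, minor, fields[3], name)


def should_run(info: ScriptInfo, model: str, current: Tuple[int, int, int]) -> bool:
    """Eligible when the model is listed, series and major match, and the
    script's minor number is no newer than the release under test."""
    series, major, minor = current
    return (
        model in info.models
        and info.series == series
        and info.major == major
        and info.minor <= minor
    )


if __name__ == "__main__":
    info = parse_script_name("TEST-AP6+AP7-4.0.1.1001-CheckVersion.script")
    print(info, should_run(info, "AP6", (4, 0, 2)))
```

Because eligibility is decided from the file name alone, dropping a new script into the folder is enough to add it to every later regression run for the matching models.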
There is no such thing as a perfect tool, and Eggplant is no exception. We found that, especially in the beginning, we struggled with how to implement effective tests. The incorporation of the icons into the Eggplant IDE requires an investment of time, and the scripts do need to be kept current as menu items are changed or functionality is added or modified. All that said, we are pleased with the positive impact that Eggplant has had on our testing efficiency, UI reliability and overall customer satisfaction.
We have seen year-over-year improvements in measurable quality metrics, fewer bugs reported from the field, and fewer downstream issues, which reduces costs. We’ve also been able to keep our testing department lighter, hiring and retaining a smaller number of senior personnel than the number of low-level button pushers we would otherwise need. This is a side benefit of Eggplant, but a desirable one from an organizational and group morale perspective.
Keysight Eggplant worked well for us and our embedded test needs. Your mileage may, as always, vary.
 FDA 21 CFR Part 11:
 FDA 21 CFR Part 58:
PUBLISHED IN CIRCUIT CELLAR MAGAZINE • DECEMBER 2022 #389
Michael Lynes is an entrepreneur who has founded several startup ventures. He was awarded a BSEE degree in Electrical Engineering from Stevens Institute of Technology and currently works as an embedded software engineer. When not occupied with arcane engineering projects, he spends his time playing with his three grandchildren, baking bread, working on ancient cars, backyard birdwatching, and taking amateur photographs. He’s also a prolific author with over thirty works in print. His latest series is the Cozy Crystal Mysteries. Book one, Moonstones and Murder, is already in print, and book two is on its way. His latest works include several collections of ghost stories, short works of general fiction, a collection called Angel Stories, and another collection called November Tales, inspired by the fiction of Ray Bradbury. He currently lives with his wife Margaret in the beautiful, secluded hills of Sussex County, New Jersey. You can contact him via email at email@example.com.