Security Scrutinized
It’s not so easy to keep up with all the new security features on the latest and greatest embedded processors—especially while you’re busy focusing on the more fundamental and unique aspects of your design. In this article, Colin helps out by examining the new processor cores using TrustZone-M, a feature that helps you secure even low-cost and low-power system designs.
You might feel like the moniker of “embedded security” is becoming the latest bullet point added to the datasheets or marketing material of every chip coming out nowadays. But that doesn’t mean everything is snake oil. In fact, many useful devices and features have been released recently. With that in mind, I want to devote an article to talking about some of these, because they could make your life easier if you are looking for the “best” secure devices. I also want to show why you can’t just trust the press release (or datasheet!) for every feature.
In this article, I am going to discuss a new series of devices that feature a hardware separation of secure and non-secure code. The idea of separating security domains is a fundamental one, and it’s a critical part of the idea of “defense in depth.” Defense in depth means not having only a single layer of defense—you never want someone who found a single flaw in your system to suddenly be able to take complete control of it. The basic idea is shown in Figure 1. Note there is only very basic code present in the “secure” zone, whereas the “non-secure” section includes complex code such as parsing of data structures and third-party libraries which we are using.
Having a hardware barrier means that we can have a section of “highly sensitive and trusted” code. That section handles the crown jewels, such as cryptographic keys, and controls debug access. This trusted code is kept as simple as possible—we want to ensure there are no vulnerabilities, such as a bug in the command parser that allows reading past the end of a data structure. By keeping complex code—which is fundamentally more difficult to secure—in an “untrusted” zone, we reduce the likelihood that a security flaw in the non-secure code results in a complete compromise. Even someone with total control of the “non-secure” side has no ability to read memory in the “secure” side. With that general idea in place, let’s look at how this is implemented and used in real devices.
On Cortex M23/M33
This feature is present in the new M23/M33 cores from Arm. These cores include something called TrustZone-M, which allows the separation of code into two segments as I demonstrated in Figure 1. Note that TrustZone-M isn’t a separate core—rather, it enables the processor core to switch between a trusted and an untrusted mode. Memory segments and peripherals can be “marked” as secure or non-secure, and that access is enforced at a low level by the core (in theory).
The M23/M33 cores are unique in that they bring these features to low- to medium-end microcontroller (MCU) solutions. The M23 core is similar to a Cortex-M0+ in many respects, and the M33 is similar to a Cortex-M4. The M23/M33 is interesting because it’s designed to avoid the “trade-off game.” In other words, just because you’re targeting ultra-low power and low cost doesn’t mean you need to be stuck with poor security.
So how do you use it? For the most part, the application is developed as two separate projects with current compilers. This means an entirely separate codebase generates a separate binary that is mapped to the “secure” memory area. That binary is combined with a “non-secure” binary that can only access the non-secure code and memory spaces. Beyond just memory space, individual peripherals are also mapped into the secure or non-secure code-spaces. You need to be a little careful here. Some peripherals may have non-obvious interactions across the security domains. An ADC in the non-secure space, for example, could measure power consumption related to computations performed in the secure space. I demonstrated such an attack in a paper entitled “Cross-Domain Power Analysis Attacks” released last month. See the Circuit Cellar article materials webpage for a link.
To call between the secure and non-secure areas, a security veneer is used. This veneer comprises special functions that define exactly which features non-secure code is allowed to call. These are, for example, the arrows in Figure 1 that cross between the secure and non-secure codebases.
The example in Listing 1 shows what such a veneer looks like. Compared to some other hardware security-domain solutions, this security veneer is straightforward to use because you can define arbitrary functions and call them like normal ones. You don’t need to worry about stuffing data through a shared buffer or something similar. The M23/M33 cores are a clear step away from secure code feeling like a complicated mess, and toward secure development looking much like regular development.
/*
 * Non-secure callable function performing an arbitrary encryption operation as an example
 */
void __attribute__((cmse_nonsecure_entry)) nsc_func_enc(const uint8_t *keys, uint32_t key_len, const uint8_t *src, uint8_t *dst)
{
    /* idau_aes_enc is a function only callable from within secure space */
    idau_aes_enc(keys, key_len, src, dst);
}
LISTING 1
Shown here is an example of the security veneer that makes it simple to move arbitrary functions into the secure code-space.
Note there are other features of the M23/M33 cores beyond just the hardware barrier. For example, they include execute-only memory (XOM). You cannot perform a “load” operation from this memory space. If you want to prevent someone from reading out your code, XOM means the only path able to read the memory is the one leading to the instruction-decode logic. Using XOM requires compiler support because, for example, the compiler cannot place look-up tables within that region.
Beyond XOM, you also find support for features such as enforcement of stack-pointer bounds to prevent, or at least detect, overwriting of stack frames. Overwriting a stack frame allows an attacker to change a return address, which can be used to redirect execution either into an attacker-controlled buffer or into existing code. Incidentally, the latter is part of a Return Oriented Programming attack that I discussed in previous articles, including “The Populist Side-Channel Attack: An Overview of Spectre” in Circuit Cellar 334, May 2018.
Processors with M23/M33
Now, with all that fluff about the M23/M33 cores, you might wonder: Where can you find one? The first device to market under this moniker was Microchip Technology’s SAML11 (M23 core) MCU. That was followed by the Nuvoton M2351 (also M23). The Nordic Semiconductor nRF91 was the first M33 device on the market, followed by NXP Semiconductor’s LPC55S69. Finally, STMicroelectronics (ST) is offering its STM32L5, but as of this writing it is not yet actually available for purchase.
In this article, I’m going to detail two devices I’ve used: the Microchip SAML11 and the NXP LPC55S69. The Microchip (formerly Atmel) SAML11 is an M23-based device, which again should be seen as a replacement for other low-power devices such as a Cortex-M0+. As you might expect, it’s available in the low pin-counts and small packages that are typical for such a part. On top of the M23 core, it adds several interesting security features that might jump out at you.
One of particular interest is something called “silent access.” If you’ve read this column regularly, you’re already aware of my various side-channel power analysis demonstrations. Silent access is an attempt to reduce this leakage—it effectively halves the usable memory in the enabled region by splitting each 32-bit word into two 16-bit halves, where the complement of each bit is also stored in memory, as shown in Table 1.
The idea here is that you always read the same number of “1”s from each word in memory. If you remember my various demonstrations, you’ll recall that the number of “1”s read from memory is correlated with the power consumption at the instant the read happens. Attackers use this to learn something about the data being manipulated, which is often enough of a toehold to completely break many cryptographic algorithms.
In isolation this sounds great. But this is also implemented in something called Trust-Ram (TRAM) as shown in Figure 2. You’ll notice that the data stored in TRAM also goes over the main bus—labeled APB for Advanced Peripheral Bus—which means that the data may be power-analysis resistant inside the TRAM device, but as soon as you are loading it into the main core, this falls apart.
TRAM does add some other interesting features. The “scramble” feature could help prevent data-remanence attacks, which are commonly used to read data “left over” in SRAM—that is, when an SRAM is “erased” but its contents are not effectively cleared. Provided the scramble key doesn’t leak, you can be more confident that data stored in TRAM will be more difficult to recover than data stored in regular SRAM—even SRAM available only from the secure domain.
Being the first M23 device to market, the SAML11 already has third-party support in various compilers and libraries. While these cores are still relatively new, you may find it worthwhile to experiment with them using one of the development kits. I suspect that once they become better known, M23 core devices will become a mainstay of even low-cost IoT devices.
On LPC55S69 and Uniqueness
The NXP device is based on the M33 core, which is more powerful than the M23 core. The LPC55S69 extends that even further because it’s a dual-core device. So, from a performance and pin count perspective, this device is a rather different part than Microchip’s SAML11. But, like the SAML11, it includes some special security features beyond the M33 core.
The most prominent of these is the use of a Physically Unclonable Function (PUF). The idea of a PUF is that something integrated into the silicon fabric allows generation of unique “fingerprints.” Rather than a pre-programmed fixed identifier, which could easily be cloned, the PUF generates a unique output that depends both on the physical properties of the device itself and on some given input. The LPC55S69 uses an SRAM-based PUF. After power-on, an SRAM array contains effectively random data. This pattern is used to generate unique keys, and by design some of those keys cannot even be read out by code, but are only routed directly into the cryptographic modules. This is shown in Figure 3, where the PUF module connects directly to various cryptographic modules.
Of course, this requires some consideration of how you will use the PUF and encryption engine. Since it’s impossible (NXP claims) to read out the PUF key, a remote server cannot generate firmware updates already encrypted with that key. But you could deliver the firmware update using a much slower (asymmetric) decryption method instead; once decrypted, the firmware is re-encrypted with the PUF key as it is written to flash.
One possible problem here is that, while the PUF key cannot be read out directly, side-channel power analysis could allow you to recover the encryption key in use. To provide some additional protection, the LPC55S69 has a special mode called AES Indexed Code Block (ICB) mode. This is not one of the standard AES modes, but a much slower mode that uses the standard AES engine to generate a changing keystream. Figure 4 shows a portion of how this works—each of the “E” blocks is a standard AES encryption. Note how it generates keys k0 through kq-1 from the original PUF key. Because a different key is used for every block, side-channel power analysis attacks become substantially more complicated.
Since the regular AES mode isn’t protected against side-channel power analysis, one source of PUF key leakage is reusing the PUF key between AES-ICB mode and any other AES mode. It should be clear that taking advantage of some of these new security features requires careful usage to avoid accidentally introducing security flaws.
In addition to AES support, a hardware cipher is included specifically for encrypting/decrypting the flash memory. This is shown in Figure 3 with the PRINCE block, a low-latency block cipher designed for this sort of real-time use. Again, this can be useful with some caveats. It won’t protect you from an attacker reading out memory from your already-running application. Presumably the encryption key is already loaded in that case, so the attacker would see the seamless decryption working as intended. But it does help in the case where an attacker is able to attack the bootloader or debug interface and read out “raw” flash memory. This raw flash will be the encrypted data.
Keep a Skeptical View
It’s always good to remain skeptical of any new security claims. But looking at some of the new devices coming out shows that serious thought has gone into their designs. The most likely failure won’t be the devices themselves, but using them in such a manner that subtly undermines the security features.
Unfortunately (for now) the manufacturers aren’t keen to point out these weaknesses. But it’s important to understand them so you can avoid weaknesses in your design that will be difficult to fix later. Hopefully, I’ve given you a useful overview of what these devices accomplish and where you might find such weaknesses, so you can go out and use them successfully.
The usage of these devices currently requires two separate applications—secure and non-secure—linked together. You might find this actually makes it easier to get started. If you are already using a bootloader, for example, it can be moved into the secure space without much change to your current workflow. With a range of devices including both low-cost and high-performance parts, you should find something that suits your project requirements.
Additional materials from the author are available at:
www.circuitcellar.com/article-materials
RESOURCES
NXP Semiconductors | www.nxp.com
Microchip Technology | www.microchip.com
Nuvoton | www.nuvoton.com
STMicroelectronics | www.st.com
PUBLISHED IN CIRCUIT CELLAR MAGAZINE • July 2019 #348
Colin O’Flynn has been building and breaking electronic devices for many years. He is an assistant professor at Dalhousie University, and also CTO of NewAE Technology, both based in Halifax, NS, Canada. Some of his work is posted on his website (see link above).