You can build a physically simulated piano synthesizer that can play a variety of keys. Using a microcontroller and some basic electronic components, you can create gloves to interface with this synthesizer, so you can play the piano without the keys.
Pianos are big. Heavy, too. They have a sound quite unlike any other. The genesis of this project came from the desire to have a portable piano that sounded just like the real thing. Other motivations were an interest in sound synthesis, as well as one of our team members being a musician. With that in mind, we created a wearable system to simulate piano key presses with a model based on physical string movements (see Photo 1). We developed a pair of gloves that are capable of playing a synthesized grand piano sound. The user pulls their fingers rather than making the traditional pressing motion associated with a real piano. We placed a flex sensor under each finger on the glove and connected each sensor to a Schmitt trigger so that, when the finger bends far enough, a Microchip Technology PIC32MX250F128B microcontroller physically simulates a piano key press and outputs sound through a 12-bit DAC connected to a speaker system. This enables the user to play the natural (not sharp or flat) notes between C3 and E4. Because many basic songs utilize the C major scale, these gloves are perfect for children or beginning musicians. The gloves provide a convenient way to practice when a piano is not available.
The software system was the more challenging and interesting part of this project. We essentially created a new model of a piano, starting from the Karplus-Strong algorithm. The data flow of our project begins with the user interfacing with the gloves. By flexing a finger, the user changes the resistance of that finger’s flex sensor, pushing the associated Schmitt trigger into the activated state. This signal then propagates into a port on the microcontroller. In the music thread, the microcontroller perpetually scans these 10 ports, each representing a finger (and, therefore, a piano key), and sets a flag to simulate a key press upon a finger flex. This flag triggers the string simulations in a timer-driven interrupt service routine (ISR) that writes to the off-board digital-to-analog converter (DAC), which we connected to a hardware low-pass filter and then to the final output, a female audio jack. The user can plug speakers into this output to listen to the notes they are playing.
In order to simulate a string, we need to discuss the wave equation, as waves propagating along a string are the source of sound from a piano. We were interested in a traveling wave, and in order to simulate it, we needed to compute the wave equation in one dimension—that is, the disturbance to the string from its original position as the wave propagates. The Karplus-Strong algorithm for string synthesis is a technique that allows us to model strings physically in order to produce plucking sounds. The Karplus-Strong technique treats a string as a discretized array of values rather than a continuous function. We were able to use a shift register to represent these discrete values and perform operations on them. Our implementations of shift registers in C were arrays with a corresponding index to keep track of the current values. As one performs operations on a shift register, this index is incremented, looping back to the beginning when it runs over. That is to say, on each operation, shift the index to look at the next value.
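The array-plus-index shift register described above might look something like the following minimal sketch. The names `ShiftReg`, `shiftreg_init`, `shiftreg_step`, and the capacity `STRING_MAX_LEN` are hypothetical and chosen for illustration; plain `int` cells are used here only to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

#define STRING_MAX_LEN 128  /* hypothetical capacity, not taken from the project */

typedef struct {
    int    cells[STRING_MAX_LEN]; /* discretized string displacement values */
    size_t len;                   /* number of cells actually in use */
    size_t idx;                   /* index of the current cell */
} ShiftReg;

/* Prepare an all-zero register of the given length. */
static void shiftreg_init(ShiftReg *r, size_t len)
{
    assert(len > 0 && len <= STRING_MAX_LEN);
    r->len = len;
    r->idx = 0;
    for (size_t i = 0; i < len; i++) {
        r->cells[i] = 0;
    }
}

/* Overwrite the current cell, return the value it held, and advance
 * the index, wrapping back to the start when it runs off the end. */
static int shiftreg_step(ShiftReg *r, int new_value)
{
    int old = r->cells[r->idx];
    r->cells[r->idx] = new_value;
    r->idx++;
    if (r->idx == r->len) {
        r->idx = 0;
    }
    return old;
}
```

No data is ever moved inside the array; only the index advances, which keeps the per-sample cost constant regardless of the string length.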
The main trade-off in software was the sample rate of our sound versus the length of each string versus the number of strings to simulate simultaneously. The sample rate we ended up using for the final version was 11,025 Hz. We used this frequency because higher sample rates required the shift registers representing the strings to be much longer, which increased the memory requirements. A higher rate also reduced the number of cycles available between interrupts for the ISR and glove-handling thread to run. If the number of cycles for the ISR to complete and return exceeds the sampling period, the system fails, as the timer interrupts are disabled inside the ISR. This put a ceiling on how many strings (three per key) we could simulate at once. Our final design was able to simulate 30 strings (all 10 keys) simultaneously at 11,025 Hz without overrunning the ISR time.
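To make the trade-off concrete, the sketch below estimates how long a shift register must be for a given note and how many CPU cycles fit between two samples. It rests on assumptions: the register length is approximated as the sample rate divided by the note frequency (ignoring the extra delay contributed by the filters), the note frequencies are standard equal-temperament values, and the CPU clock is left as a parameter because the article does not state it. The function names are hypothetical.

```c
#include <assert.h>

/* Equal-temperament frequencies (Hz) of the natural notes C3 through E4. */
static const double NOTE_HZ[10] = {
    130.81, 146.83, 164.81, 174.61, 196.00,  /* C3 D3 E3 F3 G3 */
    220.00, 246.94, 261.63, 293.66, 329.63   /* A3 B3 C4 D4 E4 */
};

/* Approximate shift-register length: samples per period, rounded to nearest. */
static unsigned delay_len(double sample_rate_hz, double note_hz)
{
    assert(sample_rate_hz > 0.0 && note_hz > 0.0);
    return (unsigned)(sample_rate_hz / note_hz + 0.5);
}

/* CPU cycles available between two timer interrupts. */
static unsigned long cycles_per_sample(unsigned long cpu_hz,
                                       unsigned long sample_rate_hz)
{
    assert(sample_rate_hz > 0);
    return cpu_hz / sample_rate_hz;
}
```

At 11,025 Hz, `delay_len` gives about 84 cells for C3 and 33 for E4; doubling the sample rate to 22,050 Hz roughly doubles both while halving the cycle budget. With a hypothetical 40 MHz clock, `cycles_per_sample(40000000UL, 11025UL)` leaves 3,628 cycles per sample for all 30 strings.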
Another major decision we made early on in the design process was to use the Schmitt trigger technique to turn the finger flexes into digital signals, rather than read analog signals and process them in software. Since we wanted the software system to be as lightweight as possible, with the priority of the synthesis being highest, it made more sense to use hardware to make the signals digital. Another reason for using the Schmitt triggers was that the PIC32 package we used did not have 10 ports capable of handling analog signals, so we would have had to scan the available analog ports, multiplexing in time to meet the need for reading from 10 different sensors.
The hardware for this project was simple in theory, but it required a large amount of fine-tuning in implementation. The glove system is essentially a large, complex sensor that feeds into the microcontroller through digital I/O pins. However, flexing a human finger is not necessarily a digital operation, and there is some ambiguity about how far a finger must bend to trigger a key press. We used flex sensors to map the bending of the fingers into the analog world. A flex sensor is essentially a variable resistor whose resistance changes linearly as a user bends the sensor. In order to get values for the changes in resistance, we supplied a voltage across the resistor and integrated the resistor into a Schmitt trigger circuit. As a finger flexed, the sensor (sewn into the underside of the glove) deformed with the finger and changed the voltage dropped across the resistor. Two wires came out of each sensor, and we attached them to a Schmitt trigger on a white prototyping board. Because the resistances of the flex sensors vary by up to 30%, we created a different Schmitt trigger for each sensor. That is to say, we needed to use different supporting resistors for each circuit to account for the different resistance range of each flex sensor. This is where the majority of the aforementioned fine-tuning came into play.
Figure 1 shows our inverting Schmitt trigger design with switching voltages at around 1.2 V. By bending the finger and thus flexing the sensor, the Schmitt trigger would output a high value, and by straightening the finger again, the Schmitt trigger would output a low value. The output of each Schmitt trigger circuit fed into an individual I/O pin on the microcontroller, and from there software could handle reading the high and low signals generated by finger flexes. Photo 2 shows the assembled Schmitt triggers on the white prototyping board. Photo 3 shows the analog input to the Schmitt trigger as well as the digital output. We sewed each flex sensor to the underside of each finger so the flex sensors made full contact with the fingers. We decided to put them on the underside of the finger because it made a slightly larger flex and it gives feedback where a physical key would be.
The other hardware in this project was the actual sound output setup. This system consisted of a Microchip Technology MCP4822 12-bit serial DAC that output through a hardware low-pass filter. The hardware filter was used to soften the sound, getting rid of high-frequency noise and smoothing the harsher pluck artifacts from the piano notes.
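As an illustration of what the ISR has to hand to the DAC, here is a minimal sketch of building the 16-bit SPI command word for channel A of the MCP4822, following the bit layout in the device's datasheet. The SPI transfer itself is omitted, and the helper names, the choice of 1x gain, and the mid-scale offset of 2,048 for signed samples are assumptions made for this example rather than details from our code.

```c
#include <assert.h>
#include <stdint.h>

/* MCP4822 write-command bits (bit 14 is a don't-care). */
#define DAC_SELECT_B 0x8000u /* bit 15: 1 = channel B, 0 = channel A */
#define DAC_GAIN_1X  0x2000u /* bit 13: 1 = 1x gain, 0 = 2x gain */
#define DAC_ACTIVE   0x1000u /* bit 12: 1 = output active, 0 = shut down */

/* Build the command word that writes a 12-bit value to channel A. */
static uint16_t mcp4822_word_a(uint16_t sample12)
{
    assert(sample12 <= 0x0FFFu);
    return (uint16_t)(DAC_GAIN_1X | DAC_ACTIVE | sample12);
}

/* Map a signed synthesis sample onto the DAC's unsigned 0..4095 range,
 * centering silence at mid-scale and clamping anything out of range. */
static uint16_t to_dac12(int32_t sample)
{
    int32_t v = sample + 2048;
    if (v < 0) {
        v = 0;
    }
    if (v > 4095) {
        v = 4095;
    }
    return (uint16_t)v;
}
```

Clamping before the write matters when many strings are summed, since a value that overflows 12 bits would otherwise spill into the configuration bits of the command word.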
The Schmitt triggers provided a robust interface that reduced the analog noise we would have seen reading directly from the flex sensors. A different glove design or even a different input method would still work with our software, so long as it provides a digital input. This was beneficial since every hand has a different shape and size.
While testing our design, we found that some people had difficulty moving certain fingers independently of the others, which forced us to tune the position of the flex sensor on the glove. Placing a flex sensor lower on the finger might work better for people with small hands, but people with larger hands might then have trouble flexing to the triggering threshold. The glove design can be modified so that, after further tuning, people with less dexterity in their hands can trigger a key press.
No one in our design team had the dexterity to move our pinky fingers without also contracting our ring fingers. This gave us a fat-fingered problem when trying to play one note at a time. Table 1 outlines the calibrated resistors we used in our implementation.
The main challenges in building the software for this project were associated with simulating a realistic-sounding piano in real time. We determined that storing samples of a real piano was infeasible due to the memory constraints of the PIC32 package we were using. We wanted to keep the same package that we had been using, and generating the sounds ourselves seemed like a more interesting challenge than simply playing back prerecorded sounds.
Several parts of a piano needed consideration when we were creating the physical model of a piano’s sound. Each note consists of a hammer covered in felt striking two or three strings upon a key press. The two or three strings are tuned to slightly different frequencies, but still have a known fundamental frequency when struck. The outer strings cluster around the middle string, one a few hertz higher and the other a few hertz lower. The lower notes on a piano have strings that are heavier and longer than the strings for the upper notes, and thus can behave differently than the higher note strings. When the hammer strikes the strings, they start to vibrate against the soundboard inside the piano, which then amplifies the string vibrations to make a perceptible sound. We tried to take as many physical features of a piano into account as possible when building our model.
The Karplus-Strong algorithm works by loading some initial waveform (often noise, or in our case a low-pass-filtered triangle wave to better model a hammer strike) into a shift register of a desired length. The lengths we used were directly proportional to the lengths of physical strings. The shift registers allow the changes in values to propagate along the string, as occurs in the physical world. Each value of the shift register feeds through a delay line with a low-pass filter (in our case, a function that returned a damped average of the last two samples) and then through an all-pass filter that served as a tuner. The low-pass filter causes the higher frequency components of the waveform (which are typically associated with noise and the “twang” of a string pluck) to be attenuated much more quickly than the lower frequency components (which are the actual fundamental frequencies of the piano note). This procedure executes at the sample rate of the output signal (in our case, around 11 kHz). We used this algorithm in our design to model the dynamics of the individual strings. The sound of the piano came from other modifications we made in software, such as combining several strings together (see Figure 2).
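One iteration of this loop could be sketched as follows. This is an illustration under assumptions, not our exact code: it uses `float` for readability rather than fixed-point arithmetic, the all-pass tuner is written in the common first-order form y[n] = C(x[n] - y[n-1]) + x[n-1], and the structure and field names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define KS_MAX_LEN 128  /* hypothetical capacity */

typedef struct {
    float  line[KS_MAX_LEN]; /* shift register holding the string shape */
    size_t len;              /* register length, sets the coarse pitch */
    size_t idx;              /* current position in the register */
    float  damping;          /* slightly below 1.0; controls decay time */
    float  tune;             /* all-pass coefficient for fine tuning */
    float  lp_prev;          /* previous register output (low-pass history) */
    float  ap_x_prev;        /* previous all-pass input */
    float  ap_y_prev;        /* previous all-pass output */
} KsString;

/* Advance the string by one sample and return the sample to output. */
static float ks_step(KsString *s)
{
    assert(s->len > 0 && s->len <= KS_MAX_LEN && s->idx < s->len);

    float out = s->line[s->idx];

    /* Low-pass: damped average of the last two register outputs. */
    float lp = s->damping * 0.5f * (out + s->lp_prev);
    s->lp_prev = out;

    /* First-order all-pass: adds a fractional delay to fine-tune pitch. */
    float ap = s->tune * (lp - s->ap_y_prev) + s->ap_x_prev;
    s->ap_x_prev = lp;
    s->ap_y_prev = ap;

    /* Feed the filtered value back into the register and advance. */
    s->line[s->idx] = ap;
    s->idx = (s->idx + 1) % s->len;
    return out;
}
```

With `tune` set to zero the all-pass stage reduces to a plain one-sample delay, which is a convenient starting point before adjusting the coefficient by ear or against a tuner.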
Enhancing the Karplus-Strong algorithm, our model consisted of three major components: multistring simulation per note, second-strike emulation, and a full-body echo effect. Multistring simulation boils down to simulating three physical strings for each note, an idea we took directly from actual pianos. We represented the strings as shift registers, as called for in the Karplus-Strong algorithm. Three such strings are contained in a data structure we called Key in the software. The Key structures also contained relevant tuning information about each of the three strings. We tuned the middle string to resonate at the fundamental frequency of the note, and tuned the other two strings a few hertz higher and a few hertz lower. We did this by making the shift register longer (for a lower pitch) or shorter (for a higher pitch). This clustering gave the sounds an added richness and made the harmonics of the string more complex.
Second-strike emulation was an idea to replicate the second hammer strike that occurs in an actual piano. That is to say, when someone presses down on a piano’s key, the hammer slams into the strings, bounces off, and then hits them again after bouncing off the mechanism that pushes the hammer down. This whole procedure happens very fast and is almost imperceptible to the human ear. However, it makes a difference in the sound of the note, and, when we added it to our model, it made a big difference in the quality of the sound produced. In order to create this effect in our model, we added a counter that began counting samples after the initial press. When it reached some threshold, we reloaded the shift registers with their initial post-hammer-strike state.
Full-body echo refers to the richness in the notes produced by a physical piano when someone presses a key and then allows the note to decay away. The implementation of an echo was trivial, as we just put another shift register in the model that would record the most recent sample sent to the DAC and add it to the output after a delay. We were using a simple, damped echo, and the effect was immediately noticeable in our testing. The piano notes instantly carried more weight and were much richer.
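Both effects are small enough to sketch in a few lines. The following is an illustrative version under assumptions: the names `Restrike` and `Echo`, the buffer capacity, and the use of `float` are hypothetical, and the actual threshold, delay length, and echo gain would be tuned by ear.

```c
#include <assert.h>
#include <stddef.h>

#define ECHO_MAX_LEN 2048  /* hypothetical capacity */

/* Second strike: count samples after a key press and report, exactly
 * once, the moment the strings should be reloaded with the hammer shape. */
typedef struct {
    unsigned count;     /* samples elapsed since the initial strike */
    unsigned threshold; /* samples to wait before the second strike */
    int      armed;     /* nonzero until the second strike has fired */
} Restrike;

static int restrike_tick(Restrike *r)
{
    if (!r->armed) {
        return 0;
    }
    if (++r->count < r->threshold) {
        return 0;
    }
    r->armed = 0;
    return 1; /* caller reloads the shift registers now */
}

/* Damped echo: remember what was sent to the DAC and add a scaled
 * copy of it back into the output `len` samples later. */
typedef struct {
    float  buf[ECHO_MAX_LEN];
    size_t len;  /* echo delay in samples */
    size_t idx;  /* current position in the buffer */
    float  gain; /* below 1.0 so each repeat is quieter */
} Echo;

static float echo_process(Echo *e, float dry)
{
    assert(e->len > 0 && e->len <= ECHO_MAX_LEN && e->idx < e->len);
    float out = dry + e->gain * e->buf[e->idx];
    e->buf[e->idx] = out; /* record the sample actually output */
    e->idx = (e->idx + 1) % e->len;
    return out;
}
```

Because the echo buffer stores the output rather than the dry signal, each repeat is fed back again and decays geometrically with the gain.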
We implemented the Key structure to keep track of all associated data for a particular note. There were a few different groups of information kept within this structure: the lengths of the different shift registers, the different shift registers themselves (each representing a string), variables containing information relevant to the different filters (these include some tuning values that we kept in the data structure), and variables containing the current indices of the shift registers. We represented the shift registers as arrays, with the index variables acting as pointers to the current cell to update or output. We used a fixed-point type rather than floating-point variables. The fixed-point type is essentially an integer, but we treat the lower 16 bits as the fractional portion of the number. This gives us precision down to 2^-16 while still giving a signed range on the order of 2^15 for the portion to the left of the binary point. Using fixed-point saves the cycles required to compute with floating-point types, while still giving fractional precision to around five decimal places. This was a necessary optimization for our data structures, seeing as all computation took place within a time-sensitive ISR, where the number of cycles between interrupts was limited.
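A 16.16 fixed-point type of the kind described here can be sketched as below. The type name `fix16` and the helper names are hypothetical; the multiply widens to 64 bits before shifting so that the intermediate product does not overflow.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fix16; /* 16 integer bits (signed), 16 fractional bits */

/* Convert a small integer to fixed point. */
static fix16 fix_from_int(int v)
{
    assert(v >= -32768 && v <= 32767);
    return (fix16)((int32_t)v * 65536);
}

/* Convert a constant such as a filter coefficient (truncates). */
static fix16 fix_from_double(double d)
{
    return (fix16)(d * 65536.0);
}

/* Convert back to floating point, e.g. for debugging output. */
static double fix_to_double(fix16 f)
{
    return (double)f / 65536.0;
}

/* Multiply two fixed-point values. The product of two 16.16 numbers
 * has 32 fractional bits, so shift right by 16 to return to 16.16.
 * (Right-shifting a negative value is arithmetic on common compilers.) */
static fix16 fix_mul(fix16 a, fix16 b)
{
    return (fix16)(((int64_t)a * (int64_t)b) >> 16);
}
```

Addition and subtraction need no helpers at all, since two values with the same binary point can be added as ordinary integers; only multiplication requires the extra shift.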
Our ISR made up a large portion of the software for this project. A timer triggered the ISR at the sample rate, which for us was around 11 kHz. The basic architecture for the ISR involved taking each key and reading the pressed flag. The pressed flag was set in the thread reading the I/O pins linked to the gloves, with the flag set on a flexed finger and cleared on a straightened finger. If set, the pressed flag activated the different filters and operations performed on the shift registers for a particular key, until finally outputting a sample for that ISR cycle. If cleared, the ISR passed over that key and moved to the next one to check its pressed flag as well.
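The per-sample work of the ISR can be sketched in portable C as follows. This is a simplified illustration, not our actual routine: the timer setup and DAC write are left out, each string uses only a plain two-sample average in place of the full filter chain, and the names `String`, `Key`, `string_step`, and `synth_sample` as laid out here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STRINGS_PER_KEY 3
#define MAX_LEN         96 /* hypothetical register capacity */

typedef struct {
    int32_t cells[MAX_LEN]; /* shift register for one string */
    size_t  len;            /* register length in use */
    size_t  idx;            /* current cell */
    int32_t prev;           /* previous output, for the averaging filter */
} String;

typedef struct {
    volatile int pressed;            /* set and cleared by the glove thread */
    String strings[STRINGS_PER_KEY]; /* the three detuned strings of a note */
} Key;

/* Advance one string by a sample using a simple two-point average. */
static int32_t string_step(String *s)
{
    assert(s->len > 0 && s->len <= MAX_LEN && s->idx < s->len);
    int32_t out = s->cells[s->idx];
    s->cells[s->idx] = (out + s->prev) / 2;
    s->prev = out;
    s->idx = (s->idx + 1) % s->len;
    return out;
}

/* Body of the timer ISR: mix one sample from every pressed key.
 * Keys whose flag is clear are skipped and cost almost nothing. */
static int32_t synth_sample(Key keys[], size_t num_keys)
{
    int32_t mix = 0;
    for (size_t k = 0; k < num_keys; k++) {
        if (!keys[k].pressed) {
            continue;
        }
        for (size_t j = 0; j < STRINGS_PER_KEY; j++) {
            mix += string_step(&keys[k].strings[j]);
        }
    }
    return mix;
}
```

The `pressed` flag is declared `volatile` because it is written by the glove-reading thread and read inside the interrupt, so the compiler must not cache it.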
As seen in the fast Fourier transforms (FFTs) included in Figure 3, a real piano has a much more complex sound. However, to the human ear, the difference in the sound is smaller. Visit our project website (http://bit.ly/2c5qRL4) to hear songs played on both a real piano and our gloves.
The initial shape of the string, when hit, made large differences in the quality of the sound. Originally, the Karplus-Strong algorithm populates the shift register with filtered random values before starting the iterative low-passing and all-passing characteristic of the technique. That is to say, we fill the shift register with random values and then perform some filtering (in our case, a low-pass filter developed in our MATLAB simulations) before starting the iterative part of the algorithm. This random noise provides the high-frequency “twang” that is characteristic of a plucked guitar string. To simulate the mallet dynamics, we tried a sawtooth wave and triangle wave before settling on the triangle as the better sounding hit. We used a triangle wave that filled some portion of the shift register and then performed the same low-pass filter used for the plucked string before the iterative part of the algorithm. We also found the placement of the initial wave made differences to the sound quality (i.e., where on the string we struck it). When we placed the wave too close to the string boundary, it resulted in much higher frequency components.
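Loading a hammer-style excitation might be sketched like this. It is an illustration under assumptions: the function names are hypothetical, `float` is used for readability, and a simple two-point average stands in for the low-pass filter we designed in MATLAB, whose coefficients are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Clear the register, then write one triangular bump of the given
 * width and peak height starting at cell `start`. Moving `start`
 * changes where along the string the simulated hammer lands. */
static void load_triangle(float *line, size_t len,
                          size_t start, size_t width, float peak)
{
    assert(width >= 2 && start + width <= len);
    for (size_t i = 0; i < len; i++) {
        line[i] = 0.0f;
    }
    float centre = (float)(width - 1) / 2.0f;
    for (size_t i = 0; i < width; i++) {
        float dist = (float)i - centre;
        if (dist < 0.0f) {
            dist = -dist;
        }
        /* Zero at both edges of the bump, `peak` at its centre. */
        line[start + i] = peak * (1.0f - dist / centre);
    }
}

/* Stand-in low-pass: average each cell with its left neighbor to
 * round off the corners of the triangle before synthesis starts. */
static void smooth(float *line, size_t len)
{
    float prev = 0.0f;
    for (size_t i = 0; i < len; i++) {
        float cur = line[i];
        line[i] = 0.5f * (cur + prev);
        prev = cur;
    }
}
```

For example, an 8-cell register loaded with a 5-cell bump starting at cell 1 holds 0, 0, 0.5, 1, 0.5, 0, 0, 0 before smoothing and 0, 0, 0.25, 0.75, 0.75, 0.25, 0, 0 after it.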
As you can see in Photo 4, the mallets actually hit close to the middle of the string. This allows more of the string to vibrate close to the time of the initial hit. You can also see how each key consists of three strings of slightly different lengths, which we implemented, but also how the lower notes only have two. Not shown is how the lowest notes on a standard piano are actually just one much thicker string. When we alter our code to play notes in these lower octaves, the sound is off because of the different style of producing sound. In order for our model to deal with these lower strings, we would have to change the algorithm completely to deal with non-ideal strings because of the higher mass of the lower strings. We found the sweet spot of our algorithm to be from C3 to E4.
Our piano gloves were successful at playing all of the white keys within a 10-note span on a piano. Because we tuned each note individually, the frequencies played were the correct frequencies of each note. We were able to play all ten keys at one time, which is equivalent to simulating 30 different strings at a time. This project cost us under the $100 allotted budget we set for ourselves. We believe we have achieved a piano-like sound rather than a plucking, bowed, or percussive sound.
To expand on this project, we would add the ability to play the black keys on the piano. We would also add the ability to shift octaves to make a larger variety of notes available. We could also incorporate a display to show the user which note is playing. Since we followed the standard of using a 3.5-mm audio jack, we are able to make additional upgrades to the output side. It would also be interesting to build a custom amplifier and speaker set, with the tuning matching our synthesizer. This would provide the highest quality output of sound.
 J. O. Smith, “High-Accuracy Piano-String Modeling,” Physical Audio Signal Processing, 2010, http://ccrma.stanford.edu/~jos/pasp/.
Microchip Technology, “MCP4821/MCP4822: 12-Bit DACs with Internal VREF and SPI Interface,” DS21953A, 2005, ww1.microchip.com/downloads/en/DeviceDoc/21953a.pdf.
MCP4822 DAC, MCP6242 op-amp, MicroStick II development platform, and PIC32MX250F128B microcontroller
Microchip Technology | www.microchip.com
Flex Sensor (FS-L-0055-253-ST)
Spectra Symbol | www.spectrasymbol.com
Mouser (distributor) | www.mouser.com
PUBLISHED IN CIRCUIT CELLAR MAGAZINE • NOVEMBER 2016 #316