
RPiano: A Playable MIDI Synthesizer

On a Raspberry Pi Microcontroller

Eager to explore the interface between music and electronics, and the digital representation of music, we created RPiano: a portable, playable MIDI synthesizer on a Raspberry Pi Pico (RP2040). We developed RPiano over the course of four weeks as our final project for Cornell University’s course Digital Systems Design Using Microcontrollers. This article details our experience building RPiano.



RPiano, our digital adaptation of the traditional mechanical instrument, consists of a physical keyboard ranging over two octaves. The keys can be played like a regular piano, as shown in Figure 1. On pressing a key, our synthesizer produces the note digitally, just as a traditional piano would mechanically. The key shape and size exactly match those of a traditional piano key, making the transition from a traditional piano to RPiano fairly smooth.

FIGURE 1
RPiano model

RPiano also has several built-in features that can be accessed by pressing the relevant buttons located just above the keys. Our current prototype has five stored songs, each with its own control button (Figure 2). Pressing the button for a song plays that song through a pair of attached speakers, digitally producing all the notes in the song, played in time to match the song’s rhythm. Additionally, the prototype has three different instrument modes (with corresponding buttons) that simulate the following three instruments: a grand piano, a harp, and bells. When a particular instrument mode is activated, the key presses on RPiano play the note with the tone of the instrument selected. Lastly, RPiano supports playing both the physical keyboard and a chosen pre-stored track simultaneously. This facilitates duets: the user can play one part on the keys while the in-built synthesizer plays the other.

FIGURE 2
Keyboard and buttons

While the buttons correspond to a few preselected songs, RPiano serves more broadly as a general-purpose synthesizer. It can synthesize any music file stored in the industry-standard Musical Instrument Digital Interface (MIDI) format. Thus, RPiano is compatible with millions of existing files—anything from the latest pop hits to Mozart’s timeless symphonies—with no additional processing. The user can easily change RPiano’s set of songs by supplying MIDI file paths for each song in the preferred set when compiling the software for the device.

The high-level structure of our project can be seen in Figure 3. There are three different types of user inputs: physical key presses on the keyboard, button presses to play a song, and button presses to switch instrument modes. Each of these modifies either the notes played or the kind of sound that is produced.

FIGURE 3
High-level overview of the project

We used frequency modulation (FM) synthesis to synthesize the audio output and implemented the FM synthesis algorithm in software. The synthesis generates a final output wave using the set of notes to be played, and the kind of sound to be produced (piano, harp, or bells) based on user input. The output wave is sent to a pair of speakers and played out loud.

PROJECT HARDWARE

The project was built around the RP2040 chip, the chosen microcontroller (MCU) for our course. Its high performance, low cost, and compact size made it ideal for our project. The other hardware components include the touch sensors that pick up user inputs from buttons and touch-sensitive keys, as well as the hardware used for audio output (DAC and speakers). Our entire circuit schematic is shown in Figure 4. The key functions of specific components are described in further detail in this section.

FIGURE 4
Circuit diagram for the project

Touch sensing: To detect key presses, we used human body conductance to achieve an effect similar to that of a capacitive sensor. We placed a 1MΩ pull-up resistor on the input and used a metal covering on the keys to make them conductive. When a grounded person touches the contact (the metal key), their body completes the circuit and pulls the input pin low. By sensing the voltage on the input pin, we could detect whether the key was pressed (circuit complete) or unpressed. Tests of the measured voltage showed that it remained consistently above 3V when nothing touched the key, and between 0.5V and 1.1V when the key was pressed. This left sufficient room to set our voltage cutoff at 1.2V.

Key set-up and wiring: We made the keys conductive using aluminum foil to wrap the black keys, and copper tape for the white keys. We secured a long copper wire to each key such that each wire was in contact with the metal surface of the key. These wires were connected to the input sensing on the RP2040. We made all our connections on breadboards to ease prototyping, but they could be directly soldered for more reliable connections. We built the keyboard on a cardboard box, with the electronics inside.

Multiplexers: With only 28 GPIO pins on the MCU, we could not attach each of the 29 keys to individual pins. We decided to multiplex the inputs from the keyboard to be able to detect presses on all the keys. We chose to use two 16×1 analog multiplexers. The key inputs were connected to the inputs of the two multiplexers. The multiplexers required 4 GPIO pins to select which of the 16 inputs should be passed through to the common output. The select inputs were varied through software to read each of the 16 inputs (16 keys on each multiplexer). We were able to reuse the same selector signal for both multiplexers. This system enabled us to utilize 29 analog inputs while using only four selector pins and two input pins, which freed up the other pins on the RP2040 to handle the other button inputs and the speaker output. This also allowed easy extensibility to a larger keyboard. With additional multiplexers, the sensing capabilities can be expanded to 64 keys with only two additional analog input pins.
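To make the scanning concrete, here is a minimal sketch of the select-and-read loop using the Pico SDK. The GPIO numbers for the four shared select lines and the use of ADC inputs 0 and 1 (GPIO26/27) are assumptions for illustration, not necessarily the pins used in our build.

```c
#include "pico/stdlib.h"
#include "hardware/adc.h"

#define NUM_SELECT 4
static const uint select_pins[NUM_SELECT] = {2, 3, 4, 5};   // shared select lines (pins assumed)

static void key_scan_init(void) {
    adc_init();
    adc_gpio_init(26);                        // ADC0 <- multiplexer 1 common output (assumed)
    adc_gpio_init(27);                        // ADC1 <- multiplexer 2 common output (assumed)
    for (uint b = 0; b < NUM_SELECT; b++) {
        gpio_init(select_pins[b]);
        gpio_set_dir(select_pins[b], GPIO_OUT);
    }
}

// Scan all 32 multiplexer channels; the 29 keys occupy 29 of these 32 channels.
static void scan_keys(uint16_t raw[32]) {
    for (uint ch = 0; ch < 16; ch++) {
        for (uint b = 0; b < NUM_SELECT; b++)   // present the channel number on the select lines
            gpio_put(select_pins[b], (ch >> b) & 1);
        sleep_us(5);                             // give the analog line time to settle
        adc_select_input(0);  raw[ch]      = adc_read();   // multiplexer 1
        adc_select_input(1);  raw[16 + ch] = adc_read();   // multiplexer 2
    }
}
```

Because both multiplexers share the same select lines, one pass of 16 select values reads all 32 channels.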

Digital-to-analog converter: We used an MCP4802 digital-to-analog converter (DAC) to send our output signal from the RP2040 to a set of speakers. The DAC uses the Serial Peripheral Interface (SPI) to take digital input from the RP2040 and convert it to an analog signal. This allows us to control a speaker from the RP2040.
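As a sketch of the SPI traffic involved, the following writes one 8-bit sample to channel A of the MCP4802. It assumes SPI0 has already been set up (spi_init, spi_set_format for 16-bit frames, and the SCK/TX/CSn pins switched to the SPI function); the command-word layout follows the MCP4802 datasheet.

```c
#include "pico/stdlib.h"
#include "hardware/spi.h"

// Write one 8-bit sample to MCP4802 channel A.
// Command word: bit15 = channel (0 = A), bit13 = gain (1 = 1x), bit12 = active,
// bits 11..4 = the 8 data bits, left-justified.
void dac_write_a(uint8_t sample) {
    uint16_t cmd = (0u << 15) | (1u << 13) | (1u << 12) | ((uint16_t)sample << 4);
    spi_write16_blocking(spi0, &cmd, 1);
}
```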

Speakers: We used standard desktop speakers for output. It was important that the speakers have their own power source since the RP2040 and DAC are incapable of providing the current required to play music at our desired volume.

PROJECT SOFTWARE

At a high level, our software, written in C, includes three primary components. The block diagram in Figure 3 shows how the different software components fit together.

  • Part 1 is the FM synthesis algorithm to compute the wave output written to the DAC to send to the speakers.
  • Part 2 comprises the user input detection, detecting piano key presses, instrument mode button presses, and song button presses.
  • Part 3 is the software for playing songs. It handles the song notes to be pressed and released based on stored metadata for a selected song.

Another piece of software, separate from the code running on the MCU, is a Python script that parses a chosen MIDI file and stores it in the required format in the program memory of the MCU firmware.

Our implementation is split into several threads, referred to as “protothreads,” using the protothreads library, a lightweight, stackless threading library written entirely in C [1].
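For readers unfamiliar with the library, a protothread is an ordinary C function built from the PT_ macros. The skeleton below is a minimal sketch: scan_and_update_keys is a hypothetical placeholder for the thread body, and the course version of the library adds its own scheduler on top of these macros.

```c
#include "pt.h"   // Adam Dunkels' protothreads [1]

extern void scan_and_update_keys(void);   // hypothetical helper; see the key-scan sketch above

static struct pt pt_keys;

// One protothread: do a little work, then yield so the other threads can run.
static PT_THREAD(protothread_keys(struct pt *pt)) {
    PT_BEGIN(pt);
    while (1) {
        scan_and_update_keys();
        PT_YIELD(pt);   // give up the CPU until the scheduler comes back around
    }
    PT_END(pt);
}

// In main(): PT_INIT(&pt_keys); then call protothread_keys(&pt_keys) in the scheduling loop.
```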

SYNTHESIS THEORY

We chose to digitally produce the sounds, using FM synthesis to compute amplitude values for a note at a particular frequency, and using additive synthesis to combine notes at different frequencies into a single output waveform.

FM synthesis: FM synthesis is a method of sound synthesis that involves modulating a waveform using another waveform. Two waveforms are generated, and one is used to modulate the other, as shown in Figure 5.

FIGURE 5
Waveform illustrating frequency modulation synthesis

The two waveforms are controlled by a logic structure that sets the value of each waveform at every time point. The value is based on how long the note has been played and the relevant attack, sustain, and decay parameters. At each time step the modulating waveform is calculated first, and then its amplitude is used to determine how far to step the main waveform along a precalculated sine table. This causes the main wave to progress through the sine table at different speeds based on the value of the modulating waveform. This modulated frequency can simulate many instruments better than the single pitch that the basic synthesis algorithm generates [2].
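Stripped of the envelope logic and fixed-point details, the per-sample step amounts to two phase accumulators and a table lookup. The sketch below uses floating point and names of our own choosing purely for readability; the real ISR works in fixed point for speed.

```c
#include <math.h>

#define SINE_TABLE_SIZE 256
static float sine_table[SINE_TABLE_SIZE];

// Fill the precalculated sine table once at startup.
static void sine_table_init(void) {
    for (int i = 0; i < SINE_TABLE_SIZE; i++)
        sine_table[i] = sinf(6.2831853f * (float)i / SINE_TABLE_SIZE);
}

// Per-note state for one FM voice.
typedef struct {
    float main_phase, main_inc;   // carrier phase and per-sample increment
    float mod_phase,  mod_inc;    // modulator phase and per-sample increment
    float mod_depth;              // how strongly the modulator bends the carrier
    float amplitude;              // current envelope value (attack/sustain/decay)
} fm_voice_t;

// Compute one output sample for one voice.
static float fm_step(fm_voice_t *v) {
    // Advance the modulating wave first...
    v->mod_phase += v->mod_inc;
    while (v->mod_phase >= SINE_TABLE_SIZE) v->mod_phase -= SINE_TABLE_SIZE;
    float mod = sine_table[(int)v->mod_phase];

    // ...then use its value to decide how far to step the main wave through the table.
    v->main_phase += v->main_inc + v->mod_depth * mod;
    while (v->main_phase >= SINE_TABLE_SIZE) v->main_phase -= SINE_TABLE_SIZE;
    while (v->main_phase < 0.0f)             v->main_phase += SINE_TABLE_SIZE;

    return v->amplitude * sine_table[(int)v->main_phase];
}
```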

Additive synthesis: For notes played at the same time (such as a chord), we used the principle of additive synthesis to add together all the amplitude values to create a sound comprising all the frequencies. This is simple, and only requires that at each time step we sum the amplitudes of every note that is playing. We then divide by the number of notes playing to normalize the volume. Without normalizing, the output signal could spike in volume when notes are pressed or released.
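Continuing the previous sketch, the mixing step is just a sum and a divide:

```c
// Mix the currently active voices into one output sample, normalizing by the count.
static float mix_voices(fm_voice_t voices[], const int active[], int n_active) {
    if (n_active == 0) return 0.0f;
    float sum = 0.0f;
    for (int i = 0; i < n_active; i++)
        sum += fm_step(&voices[active[i]]);
    return sum / (float)n_active;   // normalize so chords don't spike the volume
}
```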

SYNTHESIS IMPLEMENTATION

The core FM synthesis is done in an interrupt service routine (ISR), computing values for the final wave output that is sent to the speakers through the DAC. Our implementation for the FM synthesis builds on an example by Bruce Land at Cornell University [2].

The wave output values for producing the sound of a particular note cannot be precomputed and stored for each note. The output frequency at which values are written to the DAC must be high to achieve reliable (undistorted) sound output, and storing thousands of samples for every note frequency would require too much memory. Thus, the output values are computed in real time and written to the DAC each time the ISR executes. To write the DAC at the high frequency required for good sound quality, the computation must complete before a new value needs to be written. Through experimentation, we arrived at an optimal time interval of 36 microseconds: large enough to leave sufficient time for the required computations, but small enough that the sound output is smooth and pleasant to the human ear without distortion. This corresponds to an output sampling frequency of approximately 27.7kHz. At lower sampling frequencies the output was distorted and not smooth, while at higher frequencies the shorter interval left too little time for the computation to complete.
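One way to obtain a 36-microsecond periodic callback with the Pico SDK is a repeating timer, as sketched below. Here next_sample is a hypothetical hook standing in for the synthesis code, dac_write_a comes from the earlier DAC sketch, and the actual course framework may wire its ISR to the hardware timer differently.

```c
#include "pico/stdlib.h"

extern float next_sample(void);          // hypothetical: produce the next mixed sample in [-1, 1]
extern void  dac_write_a(uint8_t s);     // from the DAC sketch earlier

static struct repeating_timer synth_timer;

// Runs every 36 us (~27.7kHz): compute one sample and push it to the DAC.
static bool synth_tick(struct repeating_timer *t) {
    (void)t;
    float s = next_sample();
    dac_write_a((uint8_t)(128.0f + 127.0f * s));   // recenter to the DAC's unsigned range
    return true;                                   // keep the timer repeating
}

void start_synth_timer(void) {
    // A negative period asks the SDK to schedule start-to-start, keeping the 36 us spacing exact.
    add_repeating_timer_us(-36, synth_tick, NULL, &synth_timer);
}
```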

The efficiency of our design depended heavily on optimizing the ISR to be as fast as possible, ensuring that a new DAC value was ready every 36 microseconds, without running out of time in the ISR before the next value needed to be completed. To play any MIDI file, our implementation needed to support playing any note in the entire range of the piano (88 keys). Additionally, whether the note is pressed (and needs to be included in the output wave) is controlled by an external input choice (physical keys, or choice of song), and constantly updated by external threads. Checking all 88 keys in the ISR to see if they were pressed and then computing the required waveform for the frequency corresponding to the key took up too much time and led to the ISR running out of time before completion. This produced distorted sounds.

We approached this problem by adding a buffer for notes that could hold 10 unique notes at a time (corresponding to 10 fingers on the piano). This way, each time a note needs to be played (either on detecting a physical key press on the keyboard or a note play event in the song), its note number is added to the buffer. The ISR now only loops through the 10 notes in the buffer, checks whether they are pressed or not, and then includes them in the synthesis computation. The threads handling user input add to this buffer read by the ISR, as depicted in Figure 6.

FIGURE 6
Software high-level overview

There is also an FM synthesis control thread that sequences the synthesis ISR and precomputes fixed-point constants to make the ISR faster. The buffer is implemented as an array whose elements are kept in the order in which they were added. When the buffer is full and a new note needs to be added, the key that was least recently played is removed from the tail end and the new key is added at the front.

The size of the buffer is a configurable parameter that can be changed based on the number of unique voices that need to play at the same time. Making the buffer larger, however, trades off a lower sampling frequency, since the ISR needs more computation time to handle a larger number of simultaneous notes.
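A minimal sketch of such a buffer, with field and function names of our own choosing, might look like this:

```c
#include <stdint.h>
#include <stdbool.h>

#define NOTE_BUF_SIZE 10   // one slot per finger; configurable

typedef struct {
    uint8_t midi_note;   // which note this slot holds
    bool    pressed;     // still held down?
    bool    in_use;      // slot currently assigned to a note
} note_slot_t;

static note_slot_t note_buf[NOTE_BUF_SIZE];

// Called by the input threads when a note starts.
// New notes go in at the front; the note at the tail end falls off when the buffer is full.
void buffer_add_note(uint8_t midi_note) {
    for (int i = 0; i < NOTE_BUF_SIZE; i++)      // already present? just mark it pressed again
        if (note_buf[i].in_use && note_buf[i].midi_note == midi_note) {
            note_buf[i].pressed = true;
            return;
        }
    for (int i = NOTE_BUF_SIZE - 1; i > 0; i--)   // shift everything toward the tail
        note_buf[i] = note_buf[i - 1];
    note_buf[0] = (note_slot_t){ .midi_note = midi_note, .pressed = true, .in_use = true };
}
```

The ISR then loops only over these NOTE_BUF_SIZE slots rather than all 88 possible notes.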

USER INPUT DETECTION

A separate protothread handles physical key press detection. For the 29 keys on the physical keyboard, we use two 16-input multiplexers, each with its common output connected to a separate ADC input pin. SDK functions are used to read the input values and to switch between the two ADC inputs.

The value read from the ADC is converted to a voltage by multiplying it by a conversion factor determined through experimentation. A press is detected when the resulting voltage falls below a specified cutoff; through experimentation with our physical setup, we settled on a cutoff of 1.2V.

If a key press was detected, the note corresponding to the key was set to play. To avoid detecting a single key press twice, we also stored a “previously pressed” Boolean, and a note press was only registered if the key was currently pressed and had not been pressed on the previous scan. Additionally, the current state of the key was stored, so that the note was considered “released” in the synthesis computation only when the finger was lifted. This way, the length of the produced sound reflects how long the key was held.
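Put together, the per-key logic reduces to a threshold test plus edge detection. In the sketch below, the conversion factor assumes the RP2040’s 12-bit ADC with a 3.3V reference, buffer_add_note comes from the earlier buffer sketch, and key_to_midi_note and buffer_release_note are hypothetical helpers.

```c
#include <stdint.h>
#include <stdbool.h>

#define NUM_KEYS        29
#define ADC_TO_VOLTS    (3.3f / 4095.0f)   // 12-bit ADC, 3.3V reference (assumed)
#define PRESS_CUTOFF_V  1.2f

extern void    buffer_add_note(uint8_t midi_note);      // from the buffer sketch above
extern void    buffer_release_note(uint8_t midi_note);  // hypothetical: mark a note released
extern uint8_t key_to_midi_note(int key);               // hypothetical: map key index to MIDI note

static bool prev_pressed[NUM_KEYS];

// Interpret one raw ADC reading for one key and act only on press/release edges.
static void update_key(int key, uint16_t raw) {
    bool pressed = (raw * ADC_TO_VOLTS) < PRESS_CUTOFF_V;
    if (pressed && !prev_pressed[key])
        buffer_add_note(key_to_midi_note(key));         // new press: start the note
    else if (!pressed && prev_pressed[key])
        buffer_release_note(key_to_midi_note(key));     // finger lifted: let the note decay
    prev_pressed[key] = pressed;
}
```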

BUTTON PRESSES

Appropriately, a thread called “button press” detects button presses. This thread checks whether each of the five song buttons has been pressed by reading the button’s GPIO pin. When a press is detected, the song data corresponding to the chosen song is loaded into a global variable. Pressing the button again toggles the song between playing and paused.

Instrument button presses are detected in a similar manner. If pressed, the parameters of the FM synthesis are changed to be the values tuned for the instrument corresponding to the button. These parameters are accessed by the ISR for synthesis, generating modified sounds based on the parameter changes.
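One simple way to structure this is a table of parameter sets that the button thread points the ISR at. The field names and numbers below are placeholders for illustration, not our tuned values.

```c
// FM parameters that give each instrument its character (illustrative values only).
typedef struct {
    float mod_ratio;   // modulator frequency as a multiple of the note frequency
    float mod_depth;   // modulation index
    float attack_ms;   // envelope attack time
    float decay_ms;    // envelope decay time
} instrument_params_t;

static const instrument_params_t instruments[3] = {
    { 1.0f, 1.5f, 5.0f,  600.0f },   // grand piano
    { 2.0f, 0.8f, 3.0f,  900.0f },   // harp
    { 3.5f, 2.5f, 1.0f, 1500.0f },   // bells
};

// The synthesis ISR reads this pointer each time it computes a sample.
static const instrument_params_t * volatile current_instrument = &instruments[0];

// Called from the button-press thread when an instrument button is detected.
static void select_instrument(int i) {
    current_instrument = &instruments[i];
}
```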

MIDI FILE REPRESENTATION AND PARSING

MIDI files are the industry standard for passing musical performance information among electronic musical instruments and computers. Unlike an MP3 or WAV file, a MIDI file does not contain actual audio data; instead, it lists the notes played, their timing, duration, and desired loudness, in sequence. Because it stores no audio data, a MIDI file is much smaller than an equivalent MP3 file, which makes it ideal for our project with limited data storage. It is also instrument-independent: the synthesizer only needs to play the frequency corresponding to a given note on the chosen instrument. Further, MIDI files make it easy to change the tempo based on the user’s preference. Each MIDI note number maps to a particular frequency. For example, MIDI note number 60 corresponds to middle C on the piano (C4). We used Equation 1 to map a MIDI note number n to the frequency f used for FM synthesis:

f = 440 × 2^((n − 69) / 12) Hz   (1)

so that note 69 (A4) maps to 440Hz.
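In C, that mapping is a one-liner; a minimal sketch (the synthesizer itself works with a phase increment derived from this frequency):

```c
#include <math.h>

// Convert a MIDI note number (60 = middle C) to its frequency in Hz.
static float midi_note_to_freq(int note) {
    return 440.0f * powf(2.0f, (note - 69) / 12.0f);
}
```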

The MIDI format consists of a list of events, such as “KeyOn” or “KeyOff,” that correspond to note activation and release on a keyboard. We chose to use the Mido library in Python to parse this information [3]. We wrote a script to read any MIDI file and store the required data to play the song on our synthesizer. The script takes in a MIDI file as input, prompts the user to choose a track contained within the MIDI file, and then parses a sequence of MIDI events corresponding to the track selected.

We chose to represent each MIDI event with three fields: the note to press, the note to release, and a hold time (the time to wait before performing the event). The stored hold time is a relative value and is converted to a time in milliseconds by multiplying it by a constant conversion factor. Storing this time-based information let us capture, very concisely, enough data to reproduce the song’s exact rhythm and playing style (respecting elements of music such as rests), its note values (how long each note is played), and its pitches (the frequency of each note).

After reading all events, the script writes the accumulated list out as data to be stored in program memory. When a song play button is detected, the code iterates through each event in the song data for the chosen song. Before each event, the hold time in milliseconds is calculated from the stored relative value, using a delay tick value of 1000ms and a constant conversion factor for the song. This conversion factor can be changed to speed the song up or slow it down.
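A sketch of the stored event format and the playback loop is shown below. The field and function names are ours, NO_NOTE is a hypothetical sentinel for an unused field, and sleep_ms stands in for the protothread yield used in the real code.

```c
#include <stdint.h>
#include "pico/stdlib.h"

#define NO_NOTE 0xFF   // sentinel: this event presses/releases nothing

extern void buffer_add_note(uint8_t midi_note);      // from the buffer sketch above
extern void buffer_release_note(uint8_t midi_note);  // hypothetical: mark a note released

// One stored song event, as emitted by the Python parsing script.
typedef struct {
    uint8_t  press;      // MIDI note to start, or NO_NOTE
    uint8_t  release;    // MIDI note to stop, or NO_NOTE
    uint16_t hold_ticks; // relative wait before this event
} song_event_t;

// Step through a song; tick_ms is the per-song conversion factor
// (change it to speed the song up or slow it down).
static void play_song(const song_event_t *song, int n_events, float tick_ms) {
    for (int i = 0; i < n_events; i++) {
        sleep_ms((uint32_t)(song[i].hold_ticks * tick_ms));
        if (song[i].press   != NO_NOTE) buffer_add_note(song[i].press);
        if (song[i].release != NO_NOTE) buffer_release_note(song[i].release);
    }
}
```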

PERFORMANCE

Over four weeks, we were able to create a playable keyboard that successfully detects key touches and plays the required note. We were also able to play any readily available MIDI file on our synthesizer, making use of the entire range of the piano (88 keys), and handling songs that contain a wide range of notes and different, complicated rhythms. The pieces, when played on our synthesizer, closely modeled the sound of a real piano, and exactly replicated the rhythm and pitch specified in the MIDI file. Circuit Cellar’s Article Materials and Resources webpage contains a link to a video of RPiano in action [4].

FUTURE WORK

Overall, our design met our expectations, and in some areas exceeded them: we did not expect to handle complicated songs with many chords and quick notes so smoothly. In terms of the physical design, wiring the keys with tape and using breadboards was a quick solution that worked well for the most part, but in certain cases a key press would not register during testing because a wire had slipped. Soldering the wires onto the keys’ metal surfaces and onto a board would fix this issue and make the design more robust and durable.

There are also several extensions that we had planned as stretch goals and could implement on top of the existing functionality. For the songs, we did not use the volume information encoded in the MIDI file, since we wanted volume to be consistent across songs and the keyboard. The code could be altered to include changes in loudness on the keyboard, as well as dynamics such as piano and forte from sheet music. Another possible extension is to add the piano’s three pedals, such as the sustain pedal: we could modify the FM synthesis parameters while a pedal is pressed. Finally, we could let the user change the FM synthesis parameters with a potentiometer-based control. This would allow users to dynamically alter the kinds of sounds produced, rather than choosing among just three modes.

Acknowledgments

I would like to acknowledge Ben Manninen, a student at Cornell University, who worked with me on this project. 

REFERENCES
[1] “Protothreads—Lightweight, Stackless Threads in C,” dunkels.com: http://dunkels.com/adam/pt/
[2] B. Land, “ECE4760 PIC32 sound,” people.ece.cornell.edu: https://people.ece.cornell.edu/land/courses/ece4760/PIC32/index_sound_synth.html
[3] “Mido—MIDI Objects for Python—Mido 1.2.10 documentation,” mido.readthedocs.io: https://mido.readthedocs.io/en/latest/
[4] RPiano demonstration video: https://www.youtube.com/watch?v=z0CkNf_g0mQ


RESOURCES
Microchip Technology | www.microchip.com
Raspberry Pi | www.raspberrypi.com
Texas Instruments | www.ti.com


PUBLISHED IN CIRCUIT CELLAR MAGAZINE • NOVEMBER 2023 #400


Samiksha Hiranandani (snh44@cornell.edu) is an undergraduate senior at Cornell University studying Computer Science with an external specialization in Electrical and Computer Engineering. She is excited by the integration of software and hardware for engineering solutions.
