Motivated by the tedious nature of the sheet music annotation process, these three Cornell students built a system designed as a music composition assistant for composers, arrangers and musicians at all levels. Called PICcompose, this is a PIC32 MCU-based, end-to-end tool that converts raw audio from playing an instrument directly into an editable sheet music score. In this article, they describe the design and components of the system.
As musicians, we know that composing music can be a time-consuming task. It can be almost as difficult to notate the melodies as it is to find inspiration for them. When brainstorming, a composer will often try ideas on an instrument and then manually translate these thoughts onto paper or software to generate the sheet music. This is why we created “PICcompose”—a tool that converts raw audio data into an editable music score. We aimed to eliminate a lot of the middle-ground work in the music composition process by creating a tool that generates sheet music after playing an instrument.
In our PIC32-driven, end-to-end, software-hardware hybrid solution, we extract frequencies from the audio source and convert each one into the corresponding note in MIDI format. We then determine the note timings and send that data to an external computer. The computer compiles the data into a MIDI file, saves it and loads it into MuseScore, our group’s preferred music editing software. The use of MIDI, an industry-standard protocol for representing notes and their timing, allows the output of our system to be loaded into any music-editing software that imports MIDI files. At the end of the project, we had a nicely packaged product that demonstrated our ability to accurately notate simple melodies played on a flute.
Apart from Microchip Technology’s PIC32 microcontroller (MCU), the primary hardware components of this project are a microphone, filtering and amplification circuitry, a keypad, a toggle switch, an LED and a thin-film transistor (TFT) liquid crystal display (LCD) (Figure 1).
The first portion of our hardware design was amplification and filtering. Raw audio is picked up by an Adafruit electret microphone. The signal is then sent through a non-inverting op amp with a gain of 50. This gain was chosen because it increases the signal amplitude to the range of hundreds of millivolts, which is recognizable by the PIC32’s analog-to-digital converter (ADC). After amplification, the signal is sent through an anti-aliasing, two-pole Sallen-Key filter. The Sallen-Key filter was implemented to have a cut-off frequency of 2.5kHz and a Q of 1. We chose the frequency of 2.5kHz because it meets the Nyquist requirement and is well above the note range of a flute.
A Fast Fourier Transform (FFT) is computed on the PIC32, which samples at 8kHz. The Nyquist requirement is met because 2.5kHz is less than half the sampling frequency, and there is no input energy from a flute at such high frequencies. After this amplification and filtering, the signal becomes the input to the PIC32’s ADC. The final amplification circuit is shown in the full schematic in Figure 2.
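The anti-aliasing argument above can be checked numerically. The following Python sketch is not part of the authors' design files; it assumes an ideal, unity-gain, two-pole low-pass response with the stated 2.5kHz cut-off and Q of 1, and it uses approximate flute range limits (about 262Hz to 2,093Hz) as illustrative test frequencies. The function and constant names are hypothetical.

```python
import math

CUTOFF_HZ = 2500.0       # cut-off frequency stated in the article
Q_FACTOR = 1.0           # Q stated in the article
SAMPLE_RATE_HZ = 8000.0  # ADC sampling rate stated in the article


def lowpass_gain(freq_hz, cutoff_hz=CUTOFF_HZ, q=Q_FACTOR):
    """Magnitude of an ideal unity-gain two-pole low-pass filter at freq_hz."""
    ratio = freq_hz / cutoff_hz
    return 1.0 / math.sqrt((1.0 - ratio ** 2) ** 2 + (ratio / q) ** 2)


def gain_db(freq_hz):
    """Same magnitude expressed in decibels."""
    return 20.0 * math.log10(lowpass_gain(freq_hz))


nyquist_hz = SAMPLE_RATE_HZ / 2.0
# Approximate flute range limits (assumed), then the Nyquist and sampling frequencies.
for f in (262.0, 2093.0, nyquist_hz, SAMPLE_RATE_HZ):
    print(f"{f:7.1f} Hz: {gain_db(f):6.2f} dB")
```

Under these assumptions, the flute range passes with little attenuation. A two-pole filter rolls off gradually above the cut-off, so the anti-aliasing argument also relies on the flute producing little energy at those frequencies.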
Most of our other hardware was used for the PICcompose user interface. The keypad allows the user to input a desired tempo. It is connected to GPIO pins on a port expander that interfaces with the PIC32 MCU. The toggle switch enables a user to transition easily between two different modes: An idle Tempo Input mode, in which the user inputs a tempo and the system is simply waiting, and an active Record mode, in which the melodies the user is playing are being digitally written. An LED blinks at the user-set tempo when PICcompose is in its Record mode, and this acts as a metronome for the musician. The user can look at the LED when playing an instrument to maintain the tempo.
In terms of construction, our hardware was divided between a solder board and a PCB. The microphone, amplifier and Sallen-Key circuits were constructed on a solder board. The PIC32, TFT and Microchip port expander were soldered onto a PCB designed by Sean Carroll. The port expander allowed us to interface more hardware with the PIC32 than its own pins could support. Finally, the system had two external cables, one for power and one for the serial interface to the computer. A wall socket powered the PIC32, which then powered the solder-board circuitry at 3.3V. To create the UART serial interface for this product, we connected a computer to the PIC32 using the Adafruit USB-to-TTL cable, which transferred data to the laptop computer for MIDI file generation.
To bring it all together, a box was designed using Autodesk Fusion 360, and laser-cut to make our project more presentable. The box was designed to expose only the TFT screen, Record switch, LED, microphone and keypad. Our final product is shown in Figure 1.
Our software components involve a protothreads C program running on the PIC32, and a Python script running on an external computer. The majority of the computation occurs on the PIC32 MCU. The multi-threaded program handles both the user interface and the audio processing, and sends note data to the external computer over a serial interface, where the Python script converts the incoming data into a MIDI file.
The user interface software functionalities include controlling the programmable LED metronome, the toggle switch and the keypad. A dedicated metronome thread blinks the LED at the user-set tempo. This thread simply toggles the LED and yields for a variable length of time, depending on the beats per minute (bpm) entered by the user. Another thread handles keypad inputs. Within this thread, we debounce and decode keypad presses, then execute the appropriate actions associated with each key press. The GPIO input from the toggle switch is simply used to set a variable that states whether the system is in Record mode. This occurs within a thread that runs every millisecond, so that the switch state is continuously checked.
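The metronome timing comes down to simple arithmetic. The Python sketch below only illustrates that arithmetic; the real thread is written in C on the PIC32. It assumes the LED is toggled twice per beat (on for half a beat, off for half a beat), which the article does not state, and the function names are hypothetical.

```python
MS_PER_MINUTE = 60_000


def beat_period_ms(bpm):
    """Length of one beat in milliseconds at the given tempo."""
    if bpm <= 0:
        raise ValueError("tempo must be a positive number of beats per minute")
    return MS_PER_MINUTE / bpm


def toggle_interval_ms(bpm):
    """Yield time between LED toggles, assuming two toggles per beat."""
    return beat_period_ms(bpm) / 2


for tempo in (60, 90, 120):
    print(tempo, "bpm:", toggle_interval_ms(tempo), "ms between toggles")
```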
To interpret the notes in our audio data, we run an FFT on the data coming into the ADC from the microphone circuit. A 1,024-point FFT allows us to distinguish between notes accurately in our frequency range of interest. Consecutive notes are at least 16Hz apart in the main range of a flute, and an 8kHz sampling rate spread over 1,024 points gives us a frequency resolution of roughly 8Hz (8,000/1,024 ≈ 7.8Hz per bin). We apply a ramp window to the beginning and end of the time-domain sample to reduce the spurious high-frequency content (spectral leakage) caused by the abrupt start and stop of the sample block.
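The resolution figure and the ramp window can be sketched in a few lines of Python. This is an illustration only, not the PIC32 code: the article does not give the length of the ramps, so the 128-sample ramp used here is an assumed value, and the function names are hypothetical.

```python
SAMPLE_RATE_HZ = 8000  # stated in the article
FFT_POINTS = 1024      # stated in the article


def bin_spacing_hz(sample_rate_hz=SAMPLE_RATE_HZ, n_points=FFT_POINTS):
    """Frequency step between adjacent FFT bins."""
    return sample_rate_hz / n_points


def ramp_window(n_points=FFT_POINTS, ramp_len=128):
    """Trapezoidal window: linear ramp up, flat top, linear ramp down.

    ramp_len is an assumed value; the article does not specify it.
    """
    window = [1.0] * n_points
    for i in range(ramp_len):
        weight = i / ramp_len
        window[i] = weight                 # rising edge
        window[n_points - 1 - i] = weight  # falling edge
    return window


def apply_window(samples, window):
    """Scale each time-domain sample by its window weight."""
    return [s * w for s, w in zip(samples, window)]


print(bin_spacing_hz())  # 7.8125 Hz per bin
```

Tapering both ends of the block toward zero removes the sharp edges that would otherwise smear energy across many bins.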
From the FFT, we determine the dominant frequency in the audio signal, ignoring the first 15 frequency bins due to the presence of some large, low-frequency components. This gives us a frequency detection threshold of 117Hz (15 bins at about 7.8Hz each), well below the frequency range of a flute. We then convert the note frequency to its MIDI number using the following equation, where m is the MIDI number and fm is the note frequency in hertz:

$$m = 69 + 12\log_2\left(\frac{f_m}{440\,\mathrm{Hz}}\right)$$
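As a rough illustration of this step, the Python sketch below picks the strongest bin above the skipped range and maps it to a MIDI number with the standard equal-temperament relation (A4 = 440Hz = MIDI 69). It is not the authors' C implementation: the recursive FFT, the rounding to the nearest note and all function names are assumptions made for the example.

```python
import cmath
import math

SAMPLE_RATE_HZ = 8000  # stated in the article
FFT_POINTS = 1024      # stated in the article
SKIPPED_BINS = 15      # low-frequency bins ignored, per the article


def fft(samples):
    """Recursive radix-2 FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def dominant_frequency_hz(samples):
    """Frequency of the largest-magnitude bin, skipping the lowest bins."""
    spectrum = fft(samples)
    half = len(samples) // 2
    peak_bin = max(range(SKIPPED_BINS, half), key=lambda k: abs(spectrum[k]))
    return peak_bin * SAMPLE_RATE_HZ / len(samples)


def frequency_to_midi(freq_hz):
    """Nearest MIDI note number for a frequency in hertz."""
    return round(69 + 12 * math.log2(freq_hz / 440.0))


# Synthetic A4 tone with a large DC offset standing in for low-frequency clutter.
tone = [2.0 + math.sin(2 * math.pi * 440.0 * n / SAMPLE_RATE_HZ)
        for n in range(FFT_POINTS)]
print(frequency_to_midi(dominant_frequency_hz(tone)))
```

The DC offset in the test tone would dominate the spectrum if bin 0 were considered, which is the kind of low-frequency component the skipped bins are meant to exclude.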
The MIDI note data are sent via serial communication to the external computer in a thread that runs every millisecond. While the system is in Record mode, this thread increments a millisecond counter, and whenever a note starts or stops, it sends a serial message over UART to the listening Python script. The message contains the note’s MIDI number and the millisecond timestamp, along with one of five keywords that denote the message type to the Python script.
PYTHON AND MIDI
We use a Python script to generate the actual MIDI files. It listens for serial input through Prolific’s UART driver for Mac. We primarily used the mido library for building and parsing MIDI data, and the pySerial library to read in the serial stream at a 38,400 baud rate. A loop monitors the serial stream for timing and note messages, and then handles each message according to its contents.
The five valid serial messages begin with BPM, BEGIN, START, STOP and END. BPM carries the tempo that the user entered on the keypad, while BEGIN and END mark the start and end of the recording interval.
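A message parser for this protocol might look like the Python sketch below. The five keywords come from the article, but the exact wire format does not: the sketch assumes one ASCII message per line with space-separated integer fields (for example `START 69 1500` for a note number and a millisecond timestamp), and `parse_message` is a hypothetical name. In a live script, each line would come from a pySerial port's `readline()` call rather than from a literal.

```python
KEYWORDS = ("BPM", "BEGIN", "START", "STOP", "END")


def parse_message(raw_line):
    """Split one serial line into (keyword, [integer fields]).

    Returns None for empty lines, unknown keywords or non-numeric fields.
    The space-separated layout is an assumption, not the documented format.
    """
    if isinstance(raw_line, bytes):
        raw_line = raw_line.decode("ascii", errors="ignore")
    fields = raw_line.strip().split()
    if not fields or fields[0] not in KEYWORDS:
        return None
    try:
        values = [int(field) for field in fields[1:]]
    except ValueError:
        return None
    return fields[0], values


for line in (b"BPM 120\r\n", b"BEGIN\r\n", b"START 69 1500\r\n", b"garbage\r\n"):
    print(parse_message(line))
```

Returning None for anything unrecognized lets the main loop skip noise on the serial line instead of raising an exception in the middle of a recording.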
The START and STOP messages perform similar functions to each other: each writes MIDI bytes to the mido library’s MidiFile object, corresponding to when a note starts or stops. In the split message array, following the START or STOP keyword, two pieces of information are sent over by the PIC32: the MIDI note number and the global time in milliseconds. When writing out the command in MIDI bytes, the global time needs to be converted to a delta time, which MIDI expresses in “ticks” rather than in milliseconds. The PPQ (ticks per quarter note, or “qn”) is left at the default value (480) for MIDI files created with mido. We can calculate both START and STOP delta times in the same way using the following equation, where the times are in milliseconds:

$$\Delta t_{\text{ticks}} = \frac{t_{\text{current}} - t_{\text{last}}}{60{,}000\ \text{ms/min}} \times \text{BPM} \times \text{PPQ}$$
Essentially, we find the difference in milliseconds between the current note and the last note, convert that time to minutes, and then use the BPM (qn/minute) and PPQ (ticks/qn) to get the note length in ticks. We then write that number to the MidiFile object. Apart from note length determination, some additional filtering to improve accuracy is done in the loop.
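The conversion described here can be written as a short Python function. This is a sketch of the arithmetic only; the rounding to a whole tick and the function names are assumptions, and the 480 default matches the PPQ value mentioned in the article.

```python
MS_PER_MINUTE = 60_000
DEFAULT_PPQ = 480  # ticks per quarter note, the default cited in the article


def ms_to_ticks(delta_ms, bpm, ppq=DEFAULT_PPQ):
    """Convert a duration in milliseconds to MIDI ticks at the given tempo."""
    minutes = delta_ms / MS_PER_MINUTE
    return round(minutes * bpm * ppq)  # minutes * (qn/minute) * (ticks/qn)


def delta_ticks(last_ms, current_ms, bpm, ppq=DEFAULT_PPQ):
    """Ticks elapsed between the previous event and the current one."""
    return ms_to_ticks(current_ms - last_ms, bpm, ppq)


# At 120 bpm, a quarter note lasts 500 ms, which should map to one PPQ of ticks.
print(delta_ticks(1000, 1500, bpm=120))
```

The resulting value is what would be placed in the `time` field of a mido `Message` when the note-on or note-off event is appended to the track.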
When the script receives an END message, the recording stops and the MidiFile object is saved to a .mid file. The Python script then triggers a subprocess call that opens the MuseScore application on the computer, where a user can open the newly created MIDI file containing the generated sheet music.
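One way to launch the editor from Python on a Mac is sketched below. The article does not show the actual subprocess call, so everything here is an assumption: the sketch uses the macOS `open -a` command, guesses `MuseScore 3` as the application name, and passes the MIDI file path along so that the score opens directly. Both function names are hypothetical.

```python
import subprocess


def build_open_command(midi_path, app_name="MuseScore 3"):
    """Build a macOS 'open' command for the given app and file (assumed names)."""
    return ["open", "-a", app_name, midi_path]


def open_in_editor(midi_path, app_name="MuseScore 3"):
    """Launch the editor; returns the CompletedProcess for inspection."""
    return subprocess.run(build_open_command(midi_path, app_name), check=False)


print(build_open_command("recording.mid"))
```

Passing the command as a list rather than a single string avoids shell quoting problems when the file path contains spaces.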
The TFT display involves a series of panels drawn on the TFT screen. The first panel is an intro screen displayed when the system is first powered on (Figure 3). The two panels in Figure 4 are example displays of the Tempo Input mode. The left panel is shown when a user has not yet entered a tempo. The right panel is shown once the user has set the tempo via the keypad. These panels also display the note played by the user, if the user is playing a note on an instrument. The note name and octave are determined from the MIDI number.
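Deriving the displayed note name and octave from a MIDI number is a short calculation. The Python sketch below assumes the common convention in which MIDI note 60 is written as C4 and accidentals are spelled as sharps; the article does not say which convention the display uses, and `midi_to_note_name` is a hypothetical name.

```python
NOTE_NAMES = ("C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B")


def midi_to_note_name(midi_number):
    """Return a label such as 'A4' for a MIDI note number (60 -> 'C4')."""
    if not 0 <= midi_number <= 127:
        raise ValueError("MIDI note numbers range from 0 to 127")
    octave = midi_number // 12 - 1  # assumed convention: MIDI 60 is C4
    return f"{NOTE_NAMES[midi_number % 12]}{octave}"


for number in (60, 69, 96):
    print(number, midi_to_note_name(number))
```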
The final panel (Figure 5) is shown when the Record switch on the front of the box is flipped to “on.” Flipping it back off returns the user to the Tempo Input screen. In Record mode, the user-played note is displayed in a large font, and the tempo is shown in a small font. The bottom of this panel also displays “Recording.”
We successfully demonstrated a user’s audio getting notated and put into an editable score. You can see our project in action in our demo video below (Figure 6). As you can see in the video, we were able to notate simple scales and melodies with 100% accuracy in note frequency and length. Image captures from our notated scores are shown in Figure 7 and Figure 8.
Although the notations are accurate, the user still has the ability to edit the score, if desired. To edit, a user opens any desired notation software and loads the file produced by PICcompose. This ability to open the score across multiple platforms is attributable to our use of MIDI, a universal standard. We opened our scores in MuseScore 3, but users could choose to open their scores in other software, such as Sibelius or Finale.
Currently, our project supports only quarter-note-level resolution; in other words, the shortest note we can notate is a quarter note. Because we can resolve quarter notes at very fast tempos, we believe that with some fine-tuning of the note-length calculations, we could extend our scope to shorter notes. Another design improvement would be to implement a Bluetooth connection between the PIC32 and the external computer. Although there is nothing inherently wrong with using a serial interface, a Bluetooth connection would give the user a more convenient, wireless experience. Overall, we had a great time working on this project, and enjoyed blending our interests in music composition and embedded circuit development!
PUBLISHED IN CIRCUIT CELLAR MAGAZINE • SEPTEMBER 2020 #362