
Cabinet Simulator “Stomp Box” for Guitarists

Written by Brian Millier

Using Teensy 4.1

Creating a good real-time simulation of a music speaker cabinet generally requires an FIR filter with a large number of taps—more than was possible with the Teensy 3.6. In this project article, Brian builds one using the Teensy 4.1 module, making use of its large memory, faster speed and added features.

  • How to build a real-time simulation of a music speaker cabinet

  • Why the Teensy 4.1 is an improvement for this project

  • How to implement the fast-convolution algorithm

  • How the software architecture works

  • What is the impact of the latency issue?

  • How to design the hardware circuitry

  • Understanding the input impedance

  • Visual enhancement using a TFT

  • Teensy 4.1 board from PJRC

  • NXP Semiconductor SGTL5000

  • Teensy Audio Library from PJRC

  • CMSIS DSP routines

  • NXP MIMXRT1062 Cortex M7 MCU

  • Audio System Design Tool from PJRC

  • Texas Instruments (TI) TLV75733 LDO regulator

  • Teensy Audio Adapter module from PJRC

  • MathWorks MATLAB

In my article “Fancy Filtering with the Teensy 3.6” (Circuit Cellar 346, May 2019) [1], I introduced convolution filtering, which can be used for guitar cabinet simulation, among many other things. By using FFT routines, you can convert a time-domain-based convolution filter into one that is implemented in the frequency domain. As I explained in that article, this allows the filtering to be performed much more quickly. For example, the Teensy 3.6 board that I used at the time could easily handle a 512-tap FIR filter in real time, without taxing the MCU too much.

When you are using an FIR filter for conventional low-pass or band-pass filtering, a 512-tap filter ordinarily would be more than adequate. That is, it would be able to produce a very steep transition at its cut-off frequency (or frequencies in a band-pass configuration). It could also be designed for minimal amplitude ripple in its pass-band region, or high attenuation outside its pass-band region.

However, for guitar speaker cabinet simulation it turns out that getting a really good simulation generally requires an FIR filter with a much larger number of taps. Although the Teensy 3.6 board’s microcontroller (MCU) had enough “horsepower” to handle a filter with more than 512 taps, there are other complications. The larger the number of taps in the filter, the larger the RAM requirements are. When I started using the Teensy 3.6 board, its 256KB of SRAM seemed huge, but it turns out it’s not enough to handle FIR filters (implemented using floating-point FFT methods) much larger than 512 taps.


When the Teensy 4.0 board became available, it sported 1MB of SRAM, which would seem to solve the memory problem. It also ran at 600MHz (more than 3 times the speed of the MCU used on the Teensy 3.6), so it looked like execution time might not be an issue. However, another issue seemed to me to be unsolvable.

To play a guitar live, it is important that the time between a note being played and when it is emitted by the loudspeaker, be minimal. If you have ever tried to talk over the phone when there is an echo present, it is very difficult and disconcerting. It’s much worse if you are playing a guitar! The more skillful you are, the more critical this time delay or “latency” becomes. It’s generally accepted that a latency of <15ms is undetectable by the human ear, so this was the benchmark I was trying to achieve.

I’ll go into more detail later, but in simple terms, if we were aiming at an FIR tap size of 5,000, and our sampling rate were the standard CD rate (44,100 samples/second), it would take the amount of time to process the signal shown in equation (1):

5,000 samples / 44,100 samples/second ≈ 113ms     (1)

This is based on the idea that if you are performing an FIR filter of “n” taps, you have to have “n” samples available with which to do the calculations.

Real-time guitar cabinet simulation can be performed using “plug-ins” for commercial Digital Audio Workstation applications on a PC. They don’t exhibit the large amount of latency shown in equation (1) when using large FIR tap sizes, so there had to be another way of doing this that I had not yet discovered.


I had moved on to other projects. But after writing my earlier article on this subject, Frank, the person with whom I had collaborated on that project, remained interested—mostly from the software-defined radio aspect of it. Frank is a ham radio operator. That’s why I only know his first name and call sign (DD4WH). He had come across some code written by Warren Pratt that implemented a uniformly partitioned, fast-convolution algorithm. This is done in the frequency domain using FFT (fast Fourier transform) and inverse FFT (iFFT) operations (as I did in my earlier article [1]).

It turns out that to do an “n” tap convolution, you needn’t wait until you have all “n” samples available before you start doing the number crunching. Instead, for example, to do a 4,096-tap convolution, you can break it into 8 partitions and perform a 512-point FFT on each partition. Alternatively, you can abandon the 512-point FFT that I used in my earlier project [1], and use a larger number of partitions, each 128 samples long, and a smaller 256-point FFT.

The 128-point partition size is not a random choice. The Teensy Audio Library deals exclusively with 128-sample blocks. That is, it buffers the incoming audio samples from the NXP Semiconductor SGTL5000’s ADC into 128-sample blocks, and uses DMA to move these blocks among the other buffers used by the various audio library functions. To provide steady, uninterrupted sound output, any Teensy Audio Library routine must accept a 128-sample block every 2.9ms, and output a processed 128-sample block at the same rate.
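The 2.9ms figure follows directly from the block size and the sample rate. As a rough illustration (the function names here are made up for this sketch and are not part of the Teensy Audio Library), the block period and the share of it used by a processing routine can be worked out like this:

```cpp
#include <cstddef>

// Time covered by one audio block, in milliseconds.
constexpr double blockPeriodMs(std::size_t blockSamples, double sampleRateHz) {
    return 1000.0 * static_cast<double>(blockSamples) / sampleRateHz;
}

// Fraction of the block period consumed by a routine that needs execMs to run.
// Anything at or above 1.0 would mean the routine cannot keep up in real time.
constexpr double blockLoad(double execMs, std::size_t blockSamples, double sampleRateHz) {
    return execMs / blockPeriodMs(blockSamples, sampleRateHz);
}
```

For example, `blockPeriodMs(128, 44100.0)` gives roughly 2.9ms, and `blockLoad(1.0, 128, 44100.0)` shows that a routine taking about 1ms uses roughly a third of each block period.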

The idea of the uniformly partitioned convolution filter previously described sounds easy, and I implemented it in less than 100 lines of code in my library routine. However, that small size was largely due to my extensive use of CMSIS (Common Microcontroller Software Interface Standard) DSP routines, and it wasn’t easy to come up with that code. Frank spent a lot of time studying Warren Pratt’s code to adapt it to a form that would compile with the Teensy’s Arduino-based compiler. I played a small role in this. My contribution was in porting his code into a C++ Teensy Audio Library class, optimizing it a bit and performing some testing.

The Arm CMSIS DSP routines perform many different matrix-style operations, such as complex FFT, iFFT and time-domain convolution routines. These routines make heavy use of the Arm DSP-like instructions that are available on the NXP (formerly Freescale) MIMXRT1062 Cortex M7 MCU used on the Teensy board. The FFT and iFFT routines contain large tables (related to the size of the FFT you are performing) that are used as look-up tables to speed up the FFT operation. I believe these tables are used to eliminate the bit-reversal operation (swapping all bits MSB to LSB) that is a part of the FFT routine. It would be impossible to independently write code for such routines that ran as fast as the CMSIS ones do!


Let’s take a look at how the uniformly partitioned convolution filter software works by referring to the block diagram in Figure 1. The actions in all the tan-colored blocks are performed using a single CMSIS function call. Before looking at what processing is repeatedly happening to the audio signals, let’s first look at the lower left section of the diagram. We have a finite impulse array that defines the filtering that we want to perform on the audio data.

Figure 1 Block diagram for the uniformly partitioned FIR filter routine used in this project. A lot of “magic” is going on under the hood here!

This array is “n” taps long, and in the case of guitar cabinet simulation, is stored in a WAV file as an array of 16-bit integers. The “n” values are read in from a microSD card to an array. The values are then converted to floating-point using a single CMSIS call. The resulting array of “n” floating-point values is then sent to a complex FFT, and the resulting frequency domain values are what is called the “Filter Mask.” While it is one long, single-dimension array in memory, it is considered to be split into n/128 partitions.
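To make the preparation steps concrete, here is a minimal portable sketch of the integer-to-float conversion and the partition count. It is a stand-in, not the project's code: the article does not say which CMSIS call is used (a call such as `arm_q15_to_float` performs this kind of scaling), and both the function names and the zero-padding of a final partial partition are assumptions of this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kPartitionTaps = 128;  // partition size, from the article

// Scale signed 16-bit samples to floats in the range [-1.0, 1.0).
std::vector<float> toFloat(const std::vector<std::int16_t>& samples) {
    std::vector<float> out;
    out.reserve(samples.size());
    for (std::int16_t s : samples) {
        out.push_back(static_cast<float>(s) / 32768.0f);
    }
    return out;
}

// Number of 128-tap partitions needed for an impulse of the given length.
// Assumption: a final partial partition is rounded up (zero-padded).
constexpr std::size_t partitionCount(std::size_t taps) {
    return (taps + kPartitionTaps - 1) / kPartitionTaps;
}
```

With these definitions, a 512-tap impulse needs 4 partitions and a 22,500-tap impulse needs 176.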

Every 2.9ms, a 128-sample block of 16-bit audio data (for both Left and Right channels) is made available by the Teensy Audio Library. Since the CMSIS complex FFT routine expects both I (in-phase) and Q (quadrature) values, the Left and Right channels’ samples are used for I and Q, respectively, and are processed independently. For this project, only a monaural guitar signal is available, so the Q channel processing is basically wasted.

I suspect that substituting a real FFT for a complex one would be fine for a monaural signal, but any speed advantages would be negated, because, for some unknown reason, the CMSIS real FFT needs twice as much array memory space as a complex one. For this project, the Cortex M7 MCU is more than fast enough, but available SRAM memory limits the Filter Mask to about 22,000 taps, so this would not be a good trade-off.

The 128 16-bit integer samples are first converted to floating-point. I went into more detail in my previous article, but since we are processing discrete, 128-sample “chunks” of the signal at a time, we would get a time-aliased signal out of the convolution, if we did not compensate for this.

There are a few ways to do this, but in this case, we add 128 of the prior samples to the start of a 256-sample block, and append the 128 new samples to the second half of this block. At the end of the processing, we will similarly split up this 256-sample block into two parts: one to be saved for the next iteration of the routine, and one to be sent out to the DAC.

The 256-sample audio signal block is then converted into the frequency domain by a complex 256-point FFT. So, we now have both the Filter Mask and the incoming audio signal in the frequency domain. These frequency domain data blocks are stored in a circular buffer in memory, which implements a frequency domain delay line.

The convolution process, if performed in the time domain, is a math-intensive process. Its execution time is linearly related to the number of taps in the Filter Mask. Even the very fast Cortex M7 MCU on the Teensy 4.1 board couldn’t handle time domain convolutions (in real time) using Filter Masks with the number of taps in the 2,000+ range. However, in the frequency domain, convolution is reduced to multiplication, and the execution time grows far more slowly as taps are added to the Filter Mask. The Teensy 4.1 probably has enough processing “horsepower” to handle Filter Masks of >50,000 taps. However, the Teensy 4.1 RAM is not large enough to accommodate the large arrays needed for such a large number of FIR filter taps.
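A back-of-the-envelope estimate shows why RAM, rather than speed, becomes the limit. The sketch below assumes that the Filter Mask and the frequency domain delay line each hold one 256-point complex spectrum of 32-bit floats per 128-tap partition. That layout is an assumption made for illustration, and the names are hypothetical.

```cpp
#include <cstddef>

constexpr std::size_t kPartitionTaps = 128;              // taps per partition (from the article)
constexpr std::size_t kFftPoints = 256;                  // FFT size (from the article)
constexpr std::size_t kBytesPerBin = 2 * sizeof(float);  // real + imaginary float (assumed layout)

// Bytes needed to store one 256-point complex spectrum per partition.
constexpr std::size_t spectrumStoreBytes(std::size_t taps) {
    const std::size_t partitions = (taps + kPartitionTaps - 1) / kPartitionTaps;
    return partitions * kFftPoints * kBytesPerBin;
}
```

Under these assumptions, a 22,500-tap impulse needs about 352KB for the Filter Mask and the same again for the delay line, which already accounts for a large share of the 1MB of on-chip RAM before any other buffers are counted.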

The gray box in the center of the diagram in Figure 1 is labeled the Complex Multiply-Accumulate function. This is where the process gets complicated, and though I understand the basic math operations involved, I don’t fully understand how it accomplishes what it does!

Basically, the Multiply-Accumulate block does what its name implies: It performs a MAC operation between many different elements of both the Filter Mask array and the circular buffer (delay line) holding the input samples. The result of all of this calculation is a 256-sample block containing the filtered input signal, still in the frequency domain. An inverse FFT routine is called to convert this data back into the time domain. One-half of this 256-sample array is saved for the next iteration of this routine (2.9ms later), and the other 128 samples are sent out to the DAC.
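The following is a minimal, desktop-testable sketch of one common way to organize a uniformly partitioned overlap-save convolution. It is not the project's library code: it uses a slow naive DFT in place of the CMSIS FFT, double precision instead of 32-bit floats, a single channel rather than the I/Q packing described above, and class and function names invented for this example. It also assumes a non-empty impulse. Its purpose is only to show how the spectrum of the block from "p" blocks ago is multiplied by partition "p" of the Filter Mask and summed.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;
constexpr std::size_t kBlock = 128;       // audio block and partition size
constexpr std::size_t kFft = 2 * kBlock;  // 256-point transform

// Naive O(N^2) DFT, used here only for clarity in place of a real FFT.
std::vector<cplx> dft(const std::vector<cplx>& in, bool inverse) {
    const std::size_t n = in.size();
    const double pi = std::acos(-1.0);
    const double sign = inverse ? 1.0 : -1.0;
    std::vector<cplx> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        cplx acc(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t)
            acc += in[t] * std::polar(1.0, sign * 2.0 * pi * double(k * t) / double(n));
        out[k] = inverse ? acc / double(n) : acc;
    }
    return out;
}

class PartitionedConvolver {
public:
    explicit PartitionedConvolver(const std::vector<double>& impulse)
        : parts_((impulse.size() + kBlock - 1) / kBlock),
          mask_(parts_),
          delay_(parts_, std::vector<cplx>(kFft)),
          prev_(kBlock, 0.0) {
        // Build the Filter Mask: each partition is 128 taps plus 128 zeros, transformed.
        for (std::size_t p = 0; p < parts_; ++p) {
            std::vector<cplx> seg(kFft);
            for (std::size_t i = 0; i < kBlock && p * kBlock + i < impulse.size(); ++i)
                seg[i] = impulse[p * kBlock + i];
            mask_[p] = dft(seg, false);
        }
    }

    // Accepts exactly 128 new samples and returns 128 filtered samples.
    std::vector<double> process(const std::vector<double>& in) {
        std::vector<cplx> buf(kFft);
        for (std::size_t i = 0; i < kBlock; ++i) {
            buf[i] = prev_[i];        // first half: previous block
            buf[kBlock + i] = in[i];  // second half: new block
        }
        prev_ = in;
        head_ = (head_ + parts_ - 1) % parts_;  // slot for the newest spectrum
        delay_[head_] = dft(buf, false);

        std::vector<cplx> acc(kFft);
        for (std::size_t p = 0; p < parts_; ++p) {
            const auto& x = delay_[(head_ + p) % parts_];  // spectrum from p blocks ago
            for (std::size_t k = 0; k < kFft; ++k) acc[k] += x[k] * mask_[p][k];
        }
        const std::vector<cplx> y = dft(acc, true);
        std::vector<double> out(kBlock);
        for (std::size_t i = 0; i < kBlock; ++i) out[i] = y[kBlock + i].real();  // keep valid half
        return out;
    }

private:
    std::size_t parts_;
    std::vector<std::vector<cplx>> mask_, delay_;
    std::vector<double> prev_;
    std::size_t head_ = 0;
};
```

Feeding consecutive 128-sample blocks to `process()` should reproduce a direct time-domain convolution of the input with the full impulse, even though only 256-point transforms are ever computed.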

Not being a math whiz, I find this whole process truly amazing. We are effectively performing a convolution using a Filter Mask with a large number of taps, using only a 256-point FFT/iFFT. Yet, we are getting all the filtering effectiveness implied by the large size of the Filter Mask. And most important to this project, we are achieving a very low latency, which is critical, as mentioned earlier.


Let’s look at the latency issue more closely. I made some measurements using my scope and a signal generator (Figure 2a, Figure 2b and Figure 2c). The yellow trace is the signal from the function generator, and the purple is the SGTL5000’s DAC output. Note the rough start to the tone burst. I can’t gate the continuous wave output of my function generator, so I just set the scope in the single-shot mode and gated the pulse by plugging in the BNC cable while the scope was armed and waiting for a trigger. The three measurements shown are based on the following conditions:

Figure 2a: I patched my program to pass along the incoming 128-sample audio blocks to the SGTL5000 DAC with the convolution filter routine patched out. The latency here was 6.4ms. This is the minimum latency you could ever achieve, using the Teensy Audio Library with no processing blocks added.

Figure 2b: I restored the convolution processing and used the 512-tap minimum-phase, low-pass filter impulse file. The latency here was 6.88ms.

Figure 2c: I loaded a 22,500-sample impulse file. This impulse file was for a real guitar amplifier, and I could readily hear the difference in the sound when I fed a guitar signal into the unit. The latency here was 11.12ms.

Four factors make up these latency figures:

  1. The SGTL5000 contains a sigma-delta ADC, which needs digital filtering (contained within the SGTL5000, itself) to convert its over-sampled bitstream into the 16-bit samples it provides. The SGTL5000 datasheet is silent on this, but others have measured it to be about 15 ADC cycles or 0.3ms.
  2. In the Teensy Audio Library, the ADC signals are accumulated into a 128-sample buffer before they are processed. This results in a 2.9ms delay at a sample rate of 44,100 samples/second.
  3. The convolution routine, itself, takes only 1ms to execute, but I suspect that having the data ready before it’s needed for the next iteration means that a 2.9ms delay is involved here as well.
  4. The audio blocks going to the DAC are buffered in the SGTL5000 DAC driver routine, which gives an additional delay of 2.9ms.
Figure 2a The 6.4ms latency of the audio library with the FIR filter routine bypassed
Figure 2b The 6.88ms latency when a 512-tap minimum-phase low-pass FIR filter is used
Figure 2c The 11.12ms latency when a real 22,500-tap cabinet impulse file is being used

With the convolution filter processing patched out, you would expect a latency of 0.3 + 2.9 + 2.9 or 6.1ms. This is quite close to what you see in Figure 2a. Once you insert the convolution filter, the higher latency figures of 6.88ms and 11.12ms can be explained by looking at the graphs of the impulse files, themselves. For the example shown in Figure 2b (512-tap minimum-phase LP filter), the impulse response file is shown in Figure 3. Here you can see that the first/dominant peak occurs at the 25th tap. This corresponds to a delay of: 25/44,100 = 0.56ms.

If you add this to the 6.4ms latency that is observed when the convolution filter is patched out, you arrive at 6.96ms, which is close to the 6.88ms figure measured by the scope in Figure 2b. The same thing can be said of the 22,500-sample impulse file that I measured in Figure 2c.
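The latency arithmetic above can be captured in a few lines. This is only a sketch with hypothetical names. The last helper turns the reasoning around to estimate where the dominant peak of an impulse file would have to sit to explain a measured latency, assuming the extra delay comes entirely from the position of that peak.

```cpp
// Expected bypass latency: ADC filter + input block buffer + DAC driver buffer.
constexpr double kBypassEstimateMs = 0.3 + 2.9 + 2.9;

// Delay contributed by an impulse response whose dominant peak sits at peakTap.
constexpr double peakDelayMs(double peakTap, double sampleRateHz) {
    return 1000.0 * peakTap / sampleRateHz;
}

// Tap position implied by a measured latency above the bypassed baseline.
constexpr double impliedPeakTap(double measuredMs, double baselineMs, double sampleRateHz) {
    return (measuredMs - baselineMs) * sampleRateHz / 1000.0;
}
```

For the Figure 2c measurement, `impliedPeakTap(11.12, 6.4, 44100.0)` comes out at a little over 200 taps, which would place the dominant peak of that cabinet impulse around 4.7ms into the file if the assumption holds.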

As a side note, the 512-tap minimum-phase FIR coefficients shown in Figure 3 are not at all like the FIR coefficients that you will get from online FIR filter calculators (such as the TFilter Web-based calculator [2] that I mentioned in my prior article). Figure 4 shows a sample of the filter coefficients derived from the TFilter program. You can see that those coefficients are symmetrically arranged around the “n”/2 tap. If you were to use these non-minimum-phase FIR coefficients (from TFilter), you would get proportionally greater latencies as you increased the size of the Filter Mask.
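To put numbers on that last point: a symmetric (linear-phase) FIR filter delays the signal by (n − 1)/2 samples, since its dominant peak sits at the center tap. A short sketch, with an illustrative function name:

```cpp
#include <cstddef>

// Group delay of a symmetric (linear-phase) FIR filter, in milliseconds:
// (taps - 1) / 2 samples at the given sample rate.
constexpr double linearPhaseDelayMs(std::size_t taps, double sampleRateHz) {
    return 1000.0 * (static_cast<double>(taps) - 1.0) / 2.0 / sampleRateHz;
}
```

At 44,100 samples/second, a symmetric 512-tap filter would add about 5.8ms, compared with roughly 0.56ms for the minimum-phase version, and a symmetric filter as long as the largest cabinet impulse (22,500 taps) would add roughly a quarter of a second.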

I did not need low-pass/band-pass filters for this project; however, Frank discovered that deriving minimum-phase FIR coefficients (Figure 3) required the use of MathWorks’ MATLAB filter function. That program took quite a long calculation time when the number of FIR taps became larger (about 30 minutes if I recall correctly). In comparison, the Web-based TFilter program provides its results virtually instantaneously.

A few words regarding my uniformly partitioned convolution library are now in order. The convolution filter code consists of two files: filter_convolutionUP.h and filter_convolutionUP.cpp.

These files must be added to the folder containing the Teensy Audio Library. This folder will be located under whatever folder you have installed the Arduino/Teensyduino IDE. The path is as follows:

c:\your arduino folder here\hardware\teensy\avr\libraries\Audio

Also, in that folder, edit Audio.h by adding the following line at the end:

#include "filter_convolutionUP.h" // library file added by Brian Millier

As with any custom audio library objects that you design yourself, this one will not show up in the Audio System Design Tool found on the PJRC website [3]. However, that does not mean that you cannot configure your program to insert it into the audio chain, using the same conventions that the Audio System Design Tool uses when it generates the code, itself. In my program, this configuration is done in lines 126-135. (See the Circuit Cellar article code and files webpage for this article’s code.)

Figure 3 Graph of the coefficients of the 512-tap minimum-phase low-pass filter used for Figure 2b
Figure 4 A sample of a 50-tap non-minimum-phase FIR filter. A filter like this will incur more latency, because the dominant peak is in the center, rather than at the beginning, as with the one shown in Figure 3.

Please refer to the schematic shown in Figure 5. The Teensy 4.1 module used for this project contains the NXP (formerly Freescale) MIMXRT1062 Arm Cortex M7 MCU running at 600MHz. This MCU contains 1MB of RAM storage, which is necessary to handle the long impulse files that I mentioned earlier. Unlike the Teensy 4.0 module used in my earlier project article, the Teensy 4.1 contains extra on-board features, some of which are used in this project:

  1. A microSD card socket, which is connected to the fast SDIO port on the MCU
  2. Footprints for either two 8MB PSRAM chips or one PSRAM device and one flash memory chip
  3. An 8MB QSPI flash memory device (providing much more program space than the 1MB available onboard the MCU)
  4. An Ethernet PHY (external breakout cable available)
  5. A USB Host port (external breakout cable available)
Figure 5 Schematic diagram of the project. Note that this wiring is for a Rev. C Audio Adapter board—the newer Rev. D boards are wired differently.

To maximize the size of the impulse files that the project would handle, I mounted one 8MB PSRAM chip on the board. Unfortunately, this QSPI PSRAM memory is not fast enough to be used for the various arrays needed by the real-time convolution routines. Accordingly, the size of the impulse files can’t become so large that they would need this PSRAM. However, I do use the PSRAM to store the files that are loaded in from the SD card, as well as for temporary arrays used when the integer values from the SD card are converted to floating-point for use by the convolution routine.

I use the SD card socket on the Teensy 4.1 to store the impulse files. It is connected to the MCU via a high speed SDIO port. The Teensy Audio Adapter [4] also contains an SD card socket, but it is interfaced via SPI, so it is a lot slower. Using the SDIO-connected SD card socket on the Teensy 4.1, the amount of time it takes to load a 22,500-sample impulse file (the maximum size this project will handle) is about 150ms. This is as fast as you can stomp on the switch to advance from one impulse to the next, so there is no noticeable delay.

The power for the Teensy 4.1 can be provided either by plugging a USB 5V adapter into the micro-USB socket on the Teensy 4.1 module, or supplying 5V to the VIN pin. I supply 5V using a USB 5V adapter. This 5V also feeds out through the Teensy’s VIN pin and that supplies 5V to the FET preamplifier through an LC filter.

There is a Texas Instruments (TI) TLV75733 LDO regulator on the Teensy 4.1. It provides a regulated 3.3V for the MIMXRT1062 MCU, and makes this 3.3V power available for external circuitry, using the “3.3V” pin. The Teensy Audio Adapter board is powered by this 3.3V supply.

All the audio signal handling is performed by the Teensy Audio Adapter module. This uses an SGTL5000 CODEC that contains the following:

  1. Programmable-gain input amplifiers for both the Stereo Line In port and a monaural microphone input
  2. A 2-channel sigma-delta ADC
  3. A 2-channel output DAC and headphone amplifier/volume control
  4. A dedicated audio DSP, which can perform basic audio functions

In this project, I make no use of the dedicated DSP, though I have used it successfully in earlier projects. The SGTL5000 datasheet doesn’t specifically mention how many bits of resolution the codec supports, but in any case, the Teensy Audio Library supports only 16 bits.

I have to admit that initially I thought I would try something new and use a discrete ADC and DAC in place of the Audio Adapter board. Almost all current devices are only available in tiny packages that I can’t handle. However, I settled on the TI PCM1808 24-bit stereo ADC and the Princeton Technology PT8211 16-bit DAC (which I have used on several earlier projects). I used SMT to DIP adapters on both of these devices.



Unfortunately, I found that the noise level that I got using these two devices was more than I preferred. I was using a protoboard with reasonably sized ground and VCC tracks, and did the hand-wiring as carefully as I could. I guess that there was too much noise on the ground bus, or picked up from the MIMXRT1062, but I wasn’t able to reduce it enough for my liking. So, out came my solder sucker, and the ADC/DAC circuitry was replaced by the Teensy Audio Adapter board.


Although the Teensy Audio Adapter board contains most of the mixed-signal circuitry needed for the project, there is one issue. The Line Inputs have enough gain to handle a guitar’s signal properly; in fact, I set the input amplifier for a full-scale amplitude of 560mV. (Its most sensitive setting is 240mV.) However, the Line In input impedance is only 29kΩ. Most guitars, apart from those with an internal pre-amplifier, have a high source impedance. Their tone will be seriously degraded if the input impedance of the amplifier they are plugged into is less than 100kΩ. Commercial guitar amplifiers have an input impedance of 500kΩ to 1MΩ.

For this reason, I added Q1, an N-channel FET configured as a source follower. This has an input impedance of 470kΩ, which matches the guitar perfectly well. Its voltage gain is basically 1, but that is fine, given the 560mV full-scale sensitivity that I set for the Line Input. I used the 5V power supply to run this preamp, and placed an LC filter in series to minimize any noise coming in from the power supply.

The only disadvantage arising from the need for a high-impedance input circuit is that this FET preamp picks up some noise from the adjacent digital circuitry and the surrounding environment in general. It is not enough to be a big issue, since guitar pick-ups, themselves, are somewhat susceptible to stray electromagnetic noise.

I was planning on using the left channel Line Out for the output signal. As I was building/programing the unit, I simply had been plugging headphones into the Teensy Audio Adapter’s headphone output. This had provided an acceptable signal-to-noise ratio. However, when I tried to feed the Line Out signal into my studio mixer, I found significantly more noise. I attributed the difference between the two outputs to ground noise introduced into the Line Output signal.

The headphone output, in contrast, was a differential signal referenced to the HPVGND terminal, which sits at about 1.5V above ground. This output is free of any ground noise, so was quieter. To match the low-impedance (16Ω) headphone output with the much higher input impedance of the studio mixer (about 100kΩ), and to increase the signal level to the several hundred millivolt level, I added T1, a tiny MET-28 audio transformer. This transformer has a 50Ω input and a 1kΩ output, and multiplies the headphone output signal by a factor of 4.47:1. Just as importantly, the transformer also isolates that 1.5V common-mode voltage on the headphone output from the ground-referenced mixer or guitar amplifier input. As tiny as it is, I measured the MET-28-T’s frequency response to be virtually flat from 40Hz to well beyond 20,000Hz.
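The 4.47:1 figure is the square root of the transformer's impedance ratio. The sketch below treats the transformer as ideal, which is an assumption (a real part has losses), and the function names are illustrative only.

```cpp
#include <cmath>

// Voltage step-up of an ideal transformer, from its rated impedances.
double voltageRatio(double primaryOhms, double secondaryOhms) {
    return std::sqrt(secondaryOhms / primaryOhms);
}

// Load seen at the primary when loadOhms is connected across the secondary.
double reflectedLoadOhms(double loadOhms, double primaryOhms, double secondaryOhms) {
    return loadOhms * primaryOhms / secondaryOhms;
}
```

Here `voltageRatio(50.0, 1000.0)` gives about 4.47, and a 100kΩ mixer input on the secondary appears as roughly 5kΩ at the headphone output, which is a light load for an amplifier intended to drive 16Ω headphones.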

I’ll mention that my guitar is a Yamaha SE1203A, which contains a battery-operated internal amplifier. As a result, its output signal has a low source impedance. For my guitar, the FET preamp that I added to this circuit would not be necessary. However, guitars with such built-in preamps are uncommon, so I incorporated the FET preamp to make the circuit more widely useable.

I used a common 2.8” SPI-based color TFT touchscreen display for the unit. Initially I didn’t plan on having any graphics requirements, but I wanted to be able to display the impulse name in a medium-sized font, and also display an impulse number in a large (72pt) font that could be seen at a distance. The touchscreen display works fine for this, and is only slightly more expensive than common alphanumeric LCD displays, which are not particularly readable in the dark or at any distance.

I wanted to have a descriptive impulse name displayed on the TFT screen. However, the SD card library that I have used in the past could handle only the old DOS-standard 8.3 filenames. It wouldn’t be much use to display only the 8 characters that made up the primary file name. Therefore, I wired up the touchscreen controller and added a feature to the program that would allow the user to enter a descriptive name for each loaded impulse file, using the touchscreen for initial data entry. This descriptive name would be linked to the impulse file name, and stored in the MCU’s EEPROM space (which is emulated in flash memory).

Just after I finished doing this, I read on the PJRC (Teensy) Forum that the latest beta version of the Teensyduino Arduino add-in contained support for long filenames, as part of the new SD card library. Incorporating that feature seemed like a much better solution, since one could merely rename each impulse file with a longer file name that was more descriptive. That eliminated the need to store descriptive impulse names in EEPROM and the use of the touchscreen to enter those names.


I realized, after looking at the mostly empty TFT screen (just an impulse name and a large font impulse #), that it would be interesting to display a picture of the guitar amplifier (or speaker cabinet) that was being simulated. It’s pretty easy to find such images on the Internet, snip them off the screen and format them in a picture editor.

My Windows 10 snipper program saves its screen captures in jpg or png format. I bring this file into Corel PhotoPaint, and resample it so that it is no larger than 250×184 pixels—the space that I have allocated for the image on the TFT screen. Then I export it in bmp format, giving it the same primary filename as the impulse file that it represents (and a .bmp extension). Luckily, a routine in the Teensy library takes bmp files from an SD card and displays them on the same TFT display that I use in this project. I load these image files on the same SD card that the impulse files are stored on. These images require about 200ms to load/display.
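Because the image shares the impulse file's primary filename, the image name can be derived from the impulse name. A minimal sketch of that idea follows. The helper name and the example filenames are hypothetical and not taken from the project's firmware.

```cpp
#include <cstddef>
#include <string>

// Replace the extension of an impulse filename with ".bmp".
// If the name has no extension, ".bmp" is simply appended.
std::string imageNameFor(const std::string& impulseFile) {
    const std::size_t dot = impulseFile.find_last_of('.');
    const std::string stem =
        (dot == std::string::npos) ? impulseFile : impulseFile.substr(0, dot);
    return stem + ".bmp";
}
```

For instance, an impulse file called "Fender Twin.wav" would map to "Fender Twin.bmp".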

To allow me to quickly sequence through the various impulse files (by repeatedly pressing the footswitch on the unit), I delay the display of this image for a few seconds. So, if you keep pressing the footswitch, each impulse will be loaded almost immediately (150ms delay maximum), and will not be slowed down by the image load/display time, since this image won’t show up until a few seconds after you have settled upon a particular impulse file. The unit’s display with a Fender amplifier is shown in Figure 6.
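One simple way to implement this kind of deferred display is a hold-off timer that restarts on every footswitch press. The sketch below is an illustration, not the project's code. The class name is invented, the 2,000ms value in the usage note is an assumed stand-in for "a few seconds," and on a Teensy the timestamps would typically come from millis().

```cpp
#include <cstdint>

class DeferredImage {
public:
    explicit DeferredImage(std::uint32_t holdOffMs) : holdOffMs_(holdOffMs) {}

    // Call on every footswitch press: restarts the hold-off timer.
    void onFootswitch(std::uint32_t nowMs) {
        lastPressMs_ = nowMs;
        pending_ = true;
    }

    // Call from the main loop: returns true exactly once per settled impulse,
    // when the hold-off time has elapsed since the last press.
    // Unsigned subtraction keeps this correct across timer wraparound.
    bool shouldDraw(std::uint32_t nowMs) {
        if (pending_ && nowMs - lastPressMs_ >= holdOffMs_) {
            pending_ = false;
            return true;
        }
        return false;
    }

private:
    std::uint32_t holdOffMs_;
    std::uint32_t lastPressMs_ = 0;
    bool pending_ = false;
};
```

With `DeferredImage image(2000);`, repeated presses keep pushing the draw back, and the picture is loaded only once the footswitch has been left alone for two seconds.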

I should mention that the Teensy Audio Adapter is currently available in two versions, Rev. C and Rev. D. Originally, the C revision module was designed to piggy-back on top of the Teensy 3.2 module; that is, the pins matched the pinout of the Teensy 3.2. The Teensy 4.0/4.1 MCU modules have a somewhat similar pinout to that of the Teensy 3.2, but the I2S pins are different. This explains the two versions of the Teensy Audio Adapter. Rev. C matches the pinout of the Teensy 3.2, and Rev. D matches the Teensy 4.0/4.1.

I don’t mount the Teensy Audio Adapter on top of the Teensy MCU module, because it makes it too bulky vertically. Since I am hand-wiring the two modules together, I’ve kept only Rev. C modules, and use them with whatever Teensy module I am using for the project. However, if you are using a Rev. C board with the much faster Teensy 4.0/4.1, you have to place a 100Ω resistor in series with the MCLK line, as shown in the Figure 5 schematic. Table 1 shows the wiring for each of the two revisions (sourced from the PJRC Teensy website).

Table 1 The wiring differences between the Rev. C and Rev. D Audio Adapter boards

The last thing I’ll mention about the SGTL5000 CODEC contained on the Teensy Audio Adapter, is that it requires an I2C connection to receive its configuration commands from the MCU. You won’t see the normal pull-up resistors on the SCL and SDA lines in the schematic, since there are 2.2kΩ resistors on the Teensy Audio Adapter module, itself. Figure 7 is a photo of the circuit board, and Figure 6 shows the completed unit in its enclosure.

Figure 6 The finished project. It looks trapezoidal, but that’s just an artifact of the angle at which I had to shoot the photo to eliminate glare.
Figure 7 The finished circuit board

I have built many music/audio-related projects with various members of the Teensy MCU family of boards. Apart from the powerful MCU that can now be found on the Teensy 4.x modules, the huge number of Arduino libraries that work with these modules is a major consideration. Also, the easy-to-use Teensy Audio Library contains the most comprehensive collection of audio functions that I have found among the four MCU product lines that I routinely use.

That said, I am still amazed that it is possible to do a 22,000-point FIR filter routine in real time on an Arm MCU with a latency in the 10-15ms range. This can be attributed to the fact that the routine is performed in the frequency domain, as mentioned earlier. However, the elegant coding found in the CMSIS DSP routines, together with the high usage of DMA data transfers in the Teensy Audio Library, also contributes to this high performance.

Although I designed this unit to perform only guitar-cabinet simulations, the Teensy MCU is nowhere near fully utilized with that task. The FFT-convolution routine, which is performed for each audio block (occurring every 2.9ms), takes a bit less than 1ms to execute, so about 60% of MCU capacity is free. Using other blocks in the Teensy Audio Library, it would be possible to add effects such as tremolo, reverb, chorus or even a guitar tuner (using the notefreq block).

The firmware for this project can be found on the Circuit Cellar’s article code and files webpage. Note that I used Arduino version 1.8.13 and Teensyduino Beta 1.54. These newest versions are needed to compile the code. My uniformly partitioned convolution filter library is available with the rest of the firmware on the article code and files webpage. It must be added to the Teensy Library as described earlier in the article. 




[1] “Fancy Filtering with the Teensy 3.6” (Circuit Cellar 346, May 2019)
[2] TFilter, online FIR filter design:
[3] Teensy Audio Library and Audio System Design Tool:
[4] Teensy 4.1, Audio Adapter and TFT Display can all be found on the PJRC (Teensy) website:

MathWorks
Microchip Technology
NXP Semiconductors
Princeton Technology
Texas Instruments



Brian Millier runs Computer Interface Consultants. He was an instrumentation engineer in the Department of Chemistry at Dalhousie University (Halifax, NS, Canada) for 29 years.


Copyright © KCK Media Corp.
All Rights Reserved
