Relaxation Generator: Reloaded

Written by Brian Millier

Internet Era Upgrade

Some years ago, Brian wrote an article for Circuit Cellar about his project that generates relaxing sounds (ocean waves, rainfall and such) and incorporates a digital clock to shut off the sounds. At the time, he built it with only Atmel 8-bit AVR MCUs and support chips. In this article, Brian describes his more modern version of the project, this time built with an Espressif ESP32 MCU to provide Internet connectivity.

About 10 years ago, I published a Circuit Cellar article about a project I had designed that could generate relaxing sounds, such as ocean waves, brooks and rainfall. I ran this device at night for help falling asleep, and to mask out random outdoor noises that would wake up our dogs, whose barking would then wake us up. Incorporated in the project was a digital clock with an alarm feature that shut off those sounds.

The design for that project had to be more hardware-intensive 10 years back. At the time, I was using only Atmel 8-bit AVR microcontrollers (MCUs), and I had to choose a model that was close to top-of-the-line to get the functionality I needed (Atmel is now part of Microchip Technology). I also needed five other support chips to complete the design. I re-designed the project once—about 5 years ago—when I started using Arm MCUs. More recently, I decided to build a more modern version, with an Espressif ESP32 MCU to provide Internet connectivity. In this article, I describe this newest version of my project.

Because alarm clock functions were important to this project, I needed a real-time-clock (RTC) circuit of some sort to handle the time-keeping. Because power outages sometimes occur where I live, I wanted an RTC that would maintain the time through a power outage. In my first model, that function was handled by Maxim Integrated’s DS1307, and, in a later version, a Philips PCF8563. Both versions used a 3V coin cell as the battery backup. The design of the earlier models was such that powering the entire unit from a battery was not practical, because power supplies of 10V, 5V and 3.3V would be needed. Therefore, when a power failure occurred, the “Wave” sound would stop. If you have used one of these relaxation devices, you know, as we found, that once the sound stops, you quickly wake up. For the earlier versions, this was a shortcoming. At least the clock never needed to be reset, because the RTC chip was backed up by the coin cell.

This time around, I decided that I wanted the entire unit to be capable of running from a battery for 12 or more hours. That eliminated the need for a discrete RTC chip, since the ESP32 MCU can maintain the correct time completely in software—as long as it’s powered up. The newest version needs no manual setting of the clock, because the ESP32 connects to my home Wi-Fi router, and gets its time setting from an Internet-based Network Time Protocol (NTP) server.

I was able to power the whole project from a battery for an extended period largely because I chose an extremely efficient digital audio amplifier module for this version. The earlier versions used a Class-B linear power amplifier (NXP Semiconductors’ TDA1517), which produced excellent quality sound but needed a 10V power supply and drew a significant amount of current.

One aspect of the project that didn’t change over the 10 years was how the sound files were stored. I wanted to have several different sounds available. These sounds are played repetitively in a “loop,” but you want each of them to have some variety over time, so they should be at least a few minutes long. It turns out that the sounds of brooks and ocean waves involve a significant amount of high audio frequencies, so I settled on the 16-bit/44.1kHz sampling rate (CD standard). Furthermore, I produce these sounds in stereo, with one speaker on a bedside table on each side of the bed. This gives a much more immersive sound.

It turns out that no low-cost, serial flash EEPROM devices are available that can handle the amount of data that these several files would contain. However, inexpensive SD cards are readily available. Even the lowest-capacity SD cards now available have much more storage capacity than is needed for even 25 such sound files. I chose an LCD display that contained an SD card socket, eliminating the cost and wiring of a separate SD card socket.

Although I have built devices that reproduced the popular, highly compressed MP3 file format, I did not consider this format here. That’s because it would require either an external MP3 decoder chip such as the VS1033, or a significant amount of processing by the ESP32 MCU. The ESP32 is capable of MP3 decoding, and software libraries are available. However, I didn’t see any advantage in using a compressed sound file format, given the huge amount of storage available on even the smallest SD card. The ESP32 has other tasks to perform in the project, and there was always a chance it would not be able to handle everything in real time, with no glitches in the sound output.

I chose the standard Microsoft .WAV file format, because it is well documented and easy to handle in software. Another advantage is that you can find “relaxing” nature sound files readily on the Internet, and these files are generally in the .WAV format. The .WAV format contains one or more sections of metadata in various “headers,” prior to the large block of data containing the actual waveform data. Although these headers contain useful information—such as the song name, data rate and the number of bits resolution—I don’t try to parse out this information from the header sections (called “chunks”).

The project is designed for a sample rate of 44,100Hz, 16-bit stereo data, and that is the format that the .WAV file(s) must be in for proper operation. Therefore, all I must look for in the file is the word “data.” Once I find that, the next 4 bytes make up a 32-bit number defining the length of the actual waveform data. I use this value to determine when I have reached the end of the waveform data. Immediately following the 4 file-length bytes are the actual data, and that is where I start reading the waveform data.

Figure 1 shows a hex dump of the beginning of an actual .WAV file that I use, with the “data” bytes circled in green. Although the bytes making up the string “data” are in the expected order, the following 4 bytes, which define the length of the waveform data, are in little-endian format (least-significant byte first), so you have to read them “backwards.” That is, the bytes A0 42 3F 08 shown after “data” in Figure 1 represent 0x083F42A0, which equals 138,363,552 bytes. At 176,400 bytes per second, this file is about 13 minutes long. In practice, one could use files that were only a few minutes long, as they are looped, and there is no “dead” (muted) time interval between the end of the file and when it starts back at the beginning.

FIGURE 1 – A hex dump of the beginning of a .WAV file. The start of the data “chunk” is marked by the ASCII string “data,” which I’ve circled in green.
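
The header search described above can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not the code used in the project: the function names readLe32 and findDataChunk are made up for the example, and the scan is the same simplification the article describes (it looks for the first occurrence of “data” and does not parse the other chunks).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assemble a 32-bit value from four bytes stored least-significant byte first.
inline uint32_t readLe32(const uint8_t *p) {
    return static_cast<uint32_t>(p[0]) |
           (static_cast<uint32_t>(p[1]) << 8) |
           (static_cast<uint32_t>(p[2]) << 16) |
           (static_cast<uint32_t>(p[3]) << 24);
}

// Scan a header buffer for the ASCII tag "data" (hypothetical helper).
// On success, stores the waveform length in dataLen and returns the offset
// of the first waveform byte. Returns -1 if the tag and its 4 length bytes
// are not both inside the buffer.
inline long findDataChunk(const uint8_t *buf, std::size_t len, uint32_t &dataLen) {
    assert(buf != nullptr);
    for (std::size_t i = 0; i + 8 <= len; ++i) {
        if (std::memcmp(buf + i, "data", 4) == 0) {
            dataLen = readLe32(buf + i + 4);
            return static_cast<long>(i + 8);
        }
    }
    return -1;
}
```

In a player, the first sector or two of the file would be read into a buffer and passed to findDataChunk; streaming would then start at the returned offset and stop after dataLen bytes.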

One aspect of my earlier designs that wasn’t ideal was the clock display. Initially, I used a common 20×2-character LCD display with an LED backlight. It was easy to dim the LED backlight, so that it was not so bright as to disturb sleeping. However, as with all LCD character displays, the font was small and hard to read at any distance. I designed my own larger font using four adjacent character positions, so it was useable.

For the next version, I used an Adafruit 4-digit LED display module. I chose it because it contains its own controller chip and is interfaced via I2C. The Arm MCU module that I was using (a Teensy 3.2) did not have a lot of spare GPIO pins, so the two-wire I2C interface was essential. The controller on this module can set 16 different LED brightness levels (by adjusting the LED current). However, I found that even the lowest brightness level seemed too bright for my liking at night. Even placing a colored filter in front of the LED module didn’t dim it enough.


For my latest version, I chose a common and inexpensive 2.8” color TFT display. An LCD display produces no actual light of its own, but merely filters/blocks the light emitted from its LED backlight. I control that backlight using a PWM (pulse width modulation) pin on the ESP32, so users have complete control over how dim they want the display to be. The software adjusts the backlight brightness, depending on whether it’s day or night. The controller library for this display contains the ability to use many different fonts/sizes, and I chose one that displays 0.5″-high characters, which are easily readable even when you’re half asleep!

One consideration that I initially overlooked, when choosing a graphic TFT LCD display, was the amount of time it would take to update the clock display. The TFT display is interfaced via SPI, and the ESP32 can handle high SPI data rates (40MHz). However, there is more to it than that. To simultaneously produce the relaxation sounds, the SD card (also an SPI device) must be accessed at a high enough rate to provide 176,400 data bytes per second. The SD card interface cannot handle the 40MHz SPI rate, however.

The waveform data must be transferred via the I2S bus to the two DACs that provide the 44.1kHz/16-bit stereo sound output. The DACs themselves have no internal buffers, so they must be fed data at a steady rate of 176,400 bytes/s, to produce “glitch-free” sound output. Therefore, the time it takes to update the TFT display must not interfere with the steady data flow needed for the sound output.

I found it interesting to note that on my 10-year-old version of this project, I was able to accomplish this audio streaming with an 8-bit ATmega644 MCU clocked at 20MHz, using a single interrupt service routine and some hand-coded assembly language. The 32-bit ESP32 MCU runs at 240MHz and contains two cores. Its I2S library routine uses DMA transfers. Even with all this MCU horsepower and DMA, it was tricky to accomplish the TFT clock display update without introducing any glitches into the audio playback. More on this later in the “Software” section.

In my original version of this project 10 years ago, the Atmel ATmega644 MCU that I used was among the fastest 8-bit MCUs available. But it didn’t contain an I2S port. Neither did most general-purpose MCUs of the day. Therefore, I used a common MCP4822 SPI 2-channel 12-bit DAC, and followed it with a TDA1517 linear stereo power amplifier chip.

This time around I used two MAX98357 devices from Maxim Integrated. The MAX98357 contains an I2S 16-bit DAC and a Class D audio amplifier, capable of putting out 3.2W of power using only a 5V power source. I used two of these for stereo. This choice helped to reduce the overall power consumption to a level where three AA batteries could be used for backup power lasting for 12 hours or more.

These devices come in either a very tiny WLP (wafer-level packaging) package or a tiny TQFN (thin quad flat no leads) package. There is no way I can personally solder such a small device to a PCB by hand. Adafruit comes to the rescue again, by selling a breakout module containing a MAX98357 device. The price of the Adafruit module is very reasonable considering it would cost about half that price for the MAX98357 IC, alone.


The MAX98357 requires three of the standard I2S signals: BCLK, LRCLK and DATA—but does not require the higher frequency MCLK signal normally needed by many other audio codecs, DACs and other devices. This is important. Although the ESP32 can produce the high-frequency MCLK signal, it can only route that signal to GPIO0, which may not be available in some project designs.

If you feed the same three I2S signals to both MAX98357 chips, how does each device know if it is the left or the right channel? This is handled in a clever way on these devices. There is a single analog input pin (SD_MODE) that determines in which of four modes it will run. In the case of the Adafruit module, there is a 1MΩ pull-up resistor connected to the SD_MODE pin and the MAX98357, itself, has an internal 100kΩ pull-down resistor. The four different modes available on the Adafruit breakout board can be achieved as shown in Table 1.

| SD_MODE status | External resistor | Selected channel |
| --- | --- | --- |
| HIGH | 0Ω to VIN | Left |
| Pull-up through RSMALL | 470kΩ to VIN | Right |
| Pull-up through RLARGE | Floating | (Left + Right)/2 |
| LOW | 0Ω to GND | Shut down |

TABLE 1 – The four different modes available on the Adafruit breakout board
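
The resistor arrangement on the SD_MODE pin is a simple voltage divider, and a short C++ sketch can show which mode each wiring option should select. The 1MΩ pull-up, the 100kΩ internal pull-down and the 470kΩ external resistor come from the text above. The comparator thresholds (0.16V, 0.77V and 1.4V) are assumed typical values as I recall them from the MAX98357 data sheet, so check them against the data sheet before relying on them. The names sdModeVolts and classify are invented for this example.

```cpp
#include <cassert>

// Voltage at SD_MODE: the breakout's 1 Mohm pull-up (optionally paralleled by
// an external resistor to VIN) against the chip's internal 100 kohm pull-down.
// Pass extPullupOhms <= 0 to model "no external resistor" (pin left floating).
inline double sdModeVolts(double vin, double extPullupOhms) {
    assert(vin > 0.0);
    const double boardPullup = 1.0e6;
    const double chipPulldown = 100.0e3;
    double rUp = boardPullup;
    if (extPullupOhms > 0.0) {
        rUp = (boardPullup * extPullupOhms) / (boardPullup + extPullupOhms);
    }
    return vin * chipPulldown / (rUp + chipPulldown);
}

enum class SdMode { Shutdown, Mix, Right, Left };

// ASSUMED typical thresholds, not taken from the article.
inline SdMode classify(double volts) {
    if (volts < 0.16) return SdMode::Shutdown;
    if (volts < 0.77) return SdMode::Mix;   // (Left + Right)/2
    if (volts < 1.40) return SdMode::Right;
    return SdMode::Left;
}
```

With a 5V supply, the floating pin sits at roughly 0.45V and an added 470kΩ resistor raises it to roughly 1.2V. Wiring the pin directly to VIN or to GND bypasses the divider and simply puts 5V or 0V on the pin.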

The MAX98357 devices use BTL (bridge-tied load) outputs; that is, the two output pins are differentially driven, and neither one should be connected to ground. This rules out the use of headphones with the MAX98357, since headphones generally have Left, Right and Common wires. Driving two separate speakers is fine, though. The last feature of the MAX98357 is the adjustable Gain pin. If you assume that the I2S digital signal being fed into the MAX98357 is at full scale (0dBFS), the output signal level is: Output Signal Level (dBV) = 2.1dB + selected Amplifier Gain (dB).

The Amplifier Gain is determined by the configuration of the Gain Slot pin, labeled Gain on the Adafruit breakout board (Table 2). Regardless of the digital input signal and amplifier gain, the maximum voltage that the MAX98357 can put out is limited by the 5V suggested maximum VDD limit. Because of the BTL output configuration, the maximum signal output is 2 × 5V or 10VPP (minus small voltage drops from the internal MOSFET output drivers). According to the MAX98357 specs, the maximum power output with a 4Ω speaker is 3.2W, which corresponds to a peak-to-peak output signal level of 8.9V.


| GAIN_SLOT configuration | Gain (dB) |
| --- | --- |
| Connected to GND via 100kΩ resistor | 15 |
| Connected directly to GND | 12 |
| Unconnected | 9 |
| Connected to VDD | 6 |
| Connected to VDD via 100kΩ resistor | 3 |

TABLE 2 – GAIN_SLOT configurations on the Adafruit breakout board

Using a full-scale I2S digital input and the maximum gain of 15dB, the output signal level would be 2.1 + 15 = 17.1dBV. This corresponds to about 7.16VRMS or 20.3V peak-to-peak, which is more than double the maximum output voltage, so a lot of distortion would occur. Clearly, the 12dB and 15dB gains are meant to be used only when the I2S digital input signals are much less than the digital, full-scale values.
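
For readers who want to check these level calculations for other gain settings, the conversions can be sketched as below. This is a minimal illustration, assuming a full-scale digital input, an unclipped sine wave and a purely resistive speaker load; the function names are invented for the example.

```cpp
#include <cassert>
#include <cmath>

// Output level in dBV for a full-scale digital input: 2.1 dB + amplifier gain.
inline double outputDbv(double gainDb) { return 2.1 + gainDb; }

// dBV is referenced to 1 V RMS, so V_rms = 10^(dBV / 20).
inline double dbvToVrms(double dbv) { return std::pow(10.0, dbv / 20.0); }

// Peak-to-peak voltage of an unclipped sine wave with the given RMS value.
inline double vrmsToVpp(double vrms) { return vrms * 2.0 * std::sqrt(2.0); }

// Average power delivered to a resistive load.
inline double powerWatts(double vrms, double loadOhms) {
    assert(loadOhms > 0.0);
    return vrms * vrms / loadOhms;
}
```

Comparing vrmsToVpp(dbvToVrms(outputDbv(gain))) with the roughly 10V peak-to-peak available from a 5V bridge-tied output shows quickly which gain settings would clip with a full-scale input.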

I needed to have a volume control in the unit. However, since the signal chain is digital all the way to the speakers, the only way to accomplish this is in the software. The 16-bit digital waveform values coming from the SD card’s .WAV file must be divided by some constant, which is derived from the volume control’s setting. The 10kΩ volume control in the project is fed from a 2.5V reference IC through a 15kΩ resistor, which places 1V across it. The ESP32’s internal ADC has a voltage reference of 1.0V. Using the ADC to measure the pot position, the wiper’s voltage will span the entire ADC input range. I use the 8-bit ADC value to determine the constant mentioned above.

I must admit I didn’t look too closely at the calculations shown in the previous MAX98357 DAC section, before I decided to go with the MAX98357 digital amplifier modules for this project. I had them on hand and had used them for an earlier project. In that project, I was pleasantly surprised that each MAX98357 could drive an older hi-fi loudspeaker cabinet with a 12″ woofer (and tweeter), adequately filling a 250ft² room.

For this project, I was using only two small speaker enclosures with 5” woofers. More importantly, the sound levels needed would be much lower, since you are trying to sleep while the unit is operating. Therefore, I wired the Gain_Slot pins for the minimum 3dB gain setting. At this low gain setting, the signal output level (with an F.S. digital input) would be 5.1dBV (1.8VRMS) or about 0.8W (double for two channels). It turned out that significantly less power per speaker was needed for comfortable audio levels.

The digital volume control is working with 16-bit integer waveforms. I reduce the default amplitude of the 16-bit waveform by multiplying it by some value in the range of 1 to 255, based upon the setting of the volume pot. Then I divide this value by 256, by arithmetically shifting the number right eight times. For the amount of attenuation that I found was needed to produce a reasonable sound level at night, this works fine and doesn’t reduce the resolution of the audio waveform enough to be objectionable.
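
The multiply-then-shift scaling can be sketched as follows. This is an illustration of the technique rather than the project's actual routine; scaleSample and scaleBuffer are hypothetical names, and the volume argument stands for the 8-bit value derived from the ADC reading of the pot.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scale one signed 16-bit sample by volume/256, where volume is 1..255.
// The product is formed in 32 bits so it cannot overflow.
// Right-shifting a negative value is arithmetic (sign-preserving) on the
// compilers commonly used for the ESP32, and is guaranteed from C++20 on.
inline int16_t scaleSample(int16_t sample, uint8_t volume) {
    assert(volume >= 1);
    const int32_t product = static_cast<int32_t>(sample) * volume;
    return static_cast<int16_t>(product >> 8);
}

// Apply the same scaling in place to a buffer of interleaved L/R samples.
inline void scaleBuffer(int16_t *samples, std::size_t count, uint8_t volume) {
    assert(samples != nullptr);
    for (std::size_t i = 0; i < count; ++i) {
        samples[i] = scaleSample(samples[i], volume);
    }
}
```

Because the divisor is a power of two, the scaling costs one multiply and one shift per sample, which matters when every sample read from the SD card must be processed before it is handed to the I2S driver.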

In hindsight, I realize that I could have made a better design choice for audio output. Given the small amount of audio power actually needed, I could have used a circuit like the one shown in Figure 2. That would have eliminated the need for software control of the volume, which would have eased some of the software timing constraints I had to handle. I have numerous PT8211 DACs on hand. I had to buy ten at about $1 each. However, they are not readily available through normal USA distributors. Also, the TPA2012D2 Class D amplifier could have been replaced by two Texas Instruments (TI) LM386 linear power amplifier ICs. Even with a VCC of only 5V, they would have put out enough audio power, and not used a whole lot of current.

FIGURE 2 – This block diagram shows what would have been a better audio output circuit than the one I had chosen.

Figure 3 is the overall circuit diagram. The ESP32 chip and supporting components/antenna are mounted on what Espressif calls the ESP32 DevKitC module. Espressif first produced these, and still sells an updated version. Not all the DevKitC modules use the same pin layout as what I show in the diagram. My module has male pins mounted on the bottom of the PC board, as shown in Figure 4.

FIGURE 3 – Schematic of the complete unit. Note that the PJRC color TFT display needs to have three resistors jumpered out, in order for the SD card socket to work reliably.

FIGURE 4 – Photo of the ESP32 DevKitC. The module I used has its male pins on the bottom of the PCB. Newer versions by Espressif have female headers mounted on the top of the board.

For some reason, the newer Espressif DevKitC modules have female headers mounted on the top of the PC board. That would mean, for example, that I couldn’t swap the newer model into my project, because the pins would all be flipped 180 degrees. I don’t know how you are expected to use the Boot and EN buttons on these newer boards, because they would be covered up by the module when it was plugged in.

The ESP32 DevKitC is programmed via its micro USB port, using the built-in serial bootloader. I do my ESP32 software development using the Arduino IDE, loaded with the ESP32 boards package. With the Arduino IDE, downloading ESP32 program code to the DevKitC is simple: the “normal” ESP32 requirement of depressing the Boot button, while toggling the EN button on/off to download code is unnecessary. This is handled by the DTR and RTS handshake signals coming from the Silicon Labs CP2102 USB/serial bridge device on-board the DevKitC module. The Arduino ESP32 downloading tool toggles the DTR/RTS lines properly, whereas other ESP32 downloading applications may or may not do this.

One issue with some of the ESP32 DevKitC modules I have used concerns the power-up reset. During project development, I kept the DevKitC plugged into my PC’s USB port for power and for programming purposes. Connected this way, the CP2102 USB/serial bridge will assert the DTR/RTS signals, so that the ESP32 will reset properly and start program execution as soon as the DevKitC board is enumerated by the PC as a valid USB com port device. However, the ESP32 would not execute a normal power-up reset when I tried to power the project using any of the following setups:

1) A USB power adapter plugged into the DevKitC USB socket
2) A battery pack consisting of three AA cells, feeding the VIN pin
3) A 3.7V LiPo battery feeding the VIN pin

The DevKitC module has a 3.3V LDO regulator on board to power the ESP32, so a battery supply to the VIN pin will work properly if the battery is greater than 3.3V (and ideally not much more than 6V). When using either of the two different battery sources, the ESP32’s VIN supply voltage should have risen to full value immediately, except for some delay charging the two 1,000µF capacitors (C2 and C3), which act as reservoirs for the two MAX98357 amplifier modules. In the case of the USB power adapter, the 5V would have risen somewhat slower than either of the batteries would have.

In all three cases, it appeared that the power to the ESP32 was not coming up to specs quickly enough for a proper ESP32 reset to occur. I temporarily removed the two 1,000µF capacitors, but that didn’t help. After consulting ESP32 forums, I ran across this issue in several posts. I eventually solved it by adding a 1µF capacitor (C4) to the ESP32’s EN (reset) line. Note that the VIN pin is actually labeled “5V” on the DevKitC, though it needn’t be a regulated 5V, as noted earlier.

I mentioned before that it was tricky to stream the audio data from the SD card to the I2S DACs, while also updating the TFT display without introducing audio glitches. Both the TFT display and the SD card interface use an SPI interface. The TFT display can handle SPI transfers at the 40MHz maximum SPI clock rate that the ESP32 can put out. Even at this high rate, I measured the display update time at 9.6ms, and that only involved updating the current time using five large-font characters.

Many fancy fonts are available with this library, but they require you to “erase” the screen area “under” the characters when refreshing the display, or you will just add the new character’s pixels on top of the old character, resulting in an unreadable display. Therefore, I chose the most primitive “block” font, since it did not need erasing and thus updated more quickly.

The SD card’s SPI interface can’t handle the 40MHz SPI rate. In fact, the Arduino SD card library runs at an SPI rate of only 4MHz. Most of the current Arduino libraries for SPI peripherals use what is called a “transactional” approach. That is, any library functions that directly perform SPI transfers will set up the SPI port for the proper SPI mode and clock rate parameters (as configured for that peripheral), prior to sending each SPI message. Therefore, if there is more than one device sharing the SPI bus, the SPI port will be properly configured for each peripheral in advance as it is accessed. This was a big advance for Arduino SPI libraries, in such cases. But it does slow things down a bit.

For this project, I decided to use both SPI ports available on the ESP32—a third is dedicated to the DevKitC’s flash memory device. The TFT display is driven by the ESP32’s HSPI port and the SD card is driven by the VSPI port, which is the default SPI port used by most ESP32 libraries that use SPI. I didn’t delve deeply into either the SD card or the TFT display libraries enough to know if using both ports was any faster than using only one SPI port. But early in my coding, I was experiencing audio glitches until I got the code optimized, so it was worth doing it this way.

Both MAX98357 DACs use the I2S port. This is a synchronous serial protocol that requires 4 bytes of audio data to be sent to the DAC at the chosen sample rate (44,100Hz). This data transfer must be a steady flow. There is no buffer inside the DAC to handle data, if it were to be sent in a burst mode. Luckily, Espressif has written a DMA-driven I2S library that handles this task. Since it is DMA-driven, it acts in the background, and other program code, such as fetching the next sector of data from the SD card, can operate concurrently.

The display I used is a 2.8” TFT touchscreen display with a resolution of 320 × 240 pixels. As just mentioned, it uses an SPI interface that can handle the high speeds put out by the ESP32’s HSPI port. I generally get these displays from PJRC, and they work very well. I recently got one of them from another source, and while it worked, it was too dim for normal use. I decided to use that one in this project, as I need a dim display for nighttime use anyway.

While this display includes a touchscreen, I didn’t use that feature. I know that the touchscreen and the associated XPT2046_touchscreen Teensy library from PJRC work well. I find using touch on such a small display to be awkward, so I decided to use three switches and a rotary encoder for the user interface. This TFT display includes a standard-sized SD card socket. I had seen posts on forums claiming that the SD card interface on this display didn’t work. It turns out that there are three resistors (R1,2,3) on the board that must be jumpered (shorted out). With that taken care of, the SD card interface worked fine, using the ESP32’s VSPI port.

To dim the display at night, I used a PWM output from the ESP32 to feed Q1, a PNP transistor. This provided a PWM-controlled current to the display’s LED backlight. The ESP32 contains a very sophisticated “LEDC” controller. It can generate up to 16 PWM signals on user-defined GPIO pins, completely in hardware. In my article “Exploring the ESP32’s Peripheral Blocks” in Circuit Cellar 332 (March 2018), I discussed this peripheral in detail, along with a few other unique ESP32 peripherals. Today there are high-level library routines available to configure these peripherals, but when I wrote that article, they weren’t available, so I wrote my own routines.

The project is normally powered by a USB power adapter capable of at least 500mA, which is plugged directly into the micro USB socket on the DevKitC. For battery backup, I decided to use three AA cells instead of a LiPo battery. Because power failures are infrequent, I assumed the battery backup would be used only sporadically. The shelf life of alkaline batteries approaches 10 years, so they wouldn’t have to be checked often. Three fresh AA cells will put out 4.8V. I placed a Schottky diode in series with the positive battery supply lead to prevent current from the 5V USB power module from entering the battery chain.

The project uses about 100mA when it is not playing any sound, and about 150mA when it is playing sound. This varies somewhat depending upon what level of dimming you apply to the TFT display. During the day, it contacts an NTP server once every hour to synchronize the time. This is a bit of overkill, and could be reduced to once per day without affecting anything. During synchronization, the current will increase, with short spikes of around 250mA for up to 15 seconds while the ESP32’s Wi-Fi circuitry is operating. The AA alkaline cells are rated around 2,400mA-hours so they should last for 12 or more hours. Figure 5 is a rear-view photo of the project in its case. The AA cells are not visible, because they are mounted on the back cover.

FIGURE 5 – Photo of the unit from the back in its cabinet. The three AA cells are not shown, because they are mounted on the back panel.
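
The battery-life estimate given above can be checked with a small calculation. The sketch below is an idealized model only: it assumes the full rated capacity is usable and ignores the series Schottky diode, the regulator's dropout voltage and the way alkaline capacity falls at higher drain, so the real run time will be somewhat shorter. The function names are invented for this example.

```cpp
#include <cassert>

// Average current when a short burst (for example, a Wi-Fi time sync)
// repeats once per period on top of a steady base current.
inline double averageCurrentMa(double baseMa, double burstMa,
                               double burstSeconds, double periodSeconds) {
    assert(periodSeconds > 0.0 && burstSeconds <= periodSeconds);
    const double duty = burstSeconds / periodSeconds;
    return baseMa * (1.0 - duty) + burstMa * duty;
}

// Idealized run time: rated capacity divided by average current.
inline double runtimeHours(double capacityMah, double averageMa) {
    assert(averageMa > 0.0);
    return capacityMah / averageMa;
}
```

Using the figures quoted in the text (150mA while playing, bursts of about 250mA for up to 15 seconds once an hour, and 2,400mAh cells), the hourly sync barely changes the average current, and the idealized run time leaves some margin over the 12-hour target.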

The protoboard I used here is a specialty board meant to mount on top of a Raspberry Pi. I had previously mounted the two MAX98357 DAC/amplifier modules on this board for a Raspberry Pi project that I had decided not to pursue. The finished unit is shown in Figure 6.

FIGURE 6 – Shown here is the completed unit mounted in a small enclosure I made from walnut.

When I switched to using the Arduino IDE from Bascom-AVR for my AVR projects, it was mainly because of the wealth of libraries available from thousands of Arduino enthusiasts. It turned out to be a wise choice, since this IDE has been expanded to handle many different Arm MCUs, of which I use Teensy 3.x and 4.0. It also handles the ESP8266/ESP32—which are not even Arm-based, but rather Tensilica Xtensa. (Tensilica is part of Cadence Design Systems.) Currently, I am using Visual Micro, an add-on to Microsoft’s Visual Studio. This VM/VS combination acts as a “wrapper” around the Arduino C++ toolchain, and provides a much better programming environment.

For this project, several critical libraries were needed to handle the task, all of which would have been difficult to develop on one’s own:

1) The SD card library to read the sound data files from the SD card
2) The TFT graphic library for the display
3) The I2S DMA-driven library to feed the DACs
4) The NTP library to set/synchronize the ESP32’s software-driven RTC with network time

The Arduino SD card library was originally written for AVR MCUs, but when you add the ESP32 board package to the Arduino IDE, you get a custom SD card library written by Espressif. The TFT touchscreen display uses an ILI9341 controller chip. Adafruit originally wrote the Adafruit_ILI9341 library for the AVR family, and it has been customized more recently to handle Teensy, ESP8266/ESP32 MCUs. It calls the Adafruit_GFX library for its graphics features. Important Note: My program uses the ESP32’s HSPI port for the TFT display’s SPI access, whereas the Adafruit ILI9341 library uses the VSPI port by default. This change is handled in my code as follows:


And in setup()

tft.begin(0, SPI2);

The above code works fine with version 1.1.0 of the Adafruit_ILI9341 library, but it won’t compile with later versions, because they have changed something. You must use the Arduino “Sketch -> Include Library -> Manage Libraries” function to load this older version of the library, if that is not the one you are currently using. The I2S DMA-driven library is written by Espressif. They use a certain style for their libraries, which differs from many other Arduino libraries. The Espressif libraries operate under the FreeRTOS operating system, and the I2S DMA-driven library needs the following included files:

#include "driver/i2s.h"
#include "freertos/queue.h"

The I2S port is configured by filling up the following two structures:

i2s_config_t i2s_config = {
    .mode = (i2s_mode_t)(I2S_MODE_MASTER | I2S_MODE_TX),
    .sample_rate = 44100,
    .bits_per_sample = (i2s_bits_per_sample_t) 16, // I2S_BITS_PER_SAMPLE_16BIT
    .channel_format = I2S_CHANNEL_FMT_RIGHT_LEFT, // stereo
    .communication_format = (i2s_comm_format_t) (I2S_COMM_FORMAT_I2S | I2S_COMM_FORMAT_I2S_MSB),
    .intr_alloc_flags = ESP_INTR_FLAG_LEVEL1, // interrupt level 1
    .dma_buf_count = 8, // number of DMA buffers
    .dma_buf_len = 64, // length of each DMA buffer
    .use_apll = (int) 1 // use the audio PLL as the clock source
};

i2s_pin_config_t pin_config = {
    .bck_io_num = 26, // this is BCK pin
    .ws_io_num = 25, // this is LRCK pin
    .data_out_num = 27, // this is DATA output pin
    .data_in_num = -1 // not used
};

The I2S port is started up as follows:

i2s_driver_install((i2s_port_t)i2s_num, &i2s_config, 0, NULL);
i2s_set_pin((i2s_port_t)i2s_num, &pin_config);
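
It can help to work out how much audio the DMA buffers configured above actually hold, since that sets how long the main code can be busy elsewhere before the output runs dry. The sketch below is plain C++ that runs on a PC. It assumes that dma_buf_len is counted in stereo frames (one left plus one right sample), which is my understanding of the Espressif driver but should be verified against its documentation. AudioFormat and the function names are invented for the example.

```cpp
#include <cassert>
#include <cstdint>

struct AudioFormat {
    uint32_t sampleRate;     // frames per second
    uint32_t channels;       // 2 for stereo
    uint32_t bitsPerSample;  // 16 for CD-quality audio
};

// Bytes in one frame (one sample for every channel).
inline uint32_t bytesPerFrame(const AudioFormat &f) {
    assert(f.bitsPerSample % 8 == 0);
    return f.channels * (f.bitsPerSample / 8);
}

// Sustained data rate the SD card must deliver.
inline uint32_t bytesPerSecond(const AudioFormat &f) {
    return f.sampleRate * bytesPerFrame(f);
}

// Milliseconds of audio held by bufCount buffers of bufLenFrames frames each
// (ASSUMES the buffer length is expressed in frames).
inline double bufferedMs(const AudioFormat &f, uint32_t bufCount,
                         uint32_t bufLenFrames) {
    assert(f.sampleRate > 0);
    return 1000.0 * bufCount * bufLenFrames / f.sampleRate;
}
```

Under that assumption, 8 buffers of 64 frames hold 512 frames, or roughly 11.6ms of audio at 44,100Hz. That is only slightly longer than the 9.6ms display update measured earlier, which may be one reason the timing was tight.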

Since the I2S port is DMA-driven, when you want the sound to stop, it is not enough to just stop filling the DMA buffers. If that’s all you do, you’ll get a constant buzz coming from the speakers. You must add the following line to de-activate the DMA driver:


As far as the NTP time synchronization is concerned, I had already done a few earlier ESP8266/ESP32 projects that needed NTP time synchronization. I didn’t use any library, instead adding all the code needed to do the initial UDP request and handle the NTP reply. I set the timeServer string variable to the NTP server’s host name and let the ESP32 resolve the IP address by using:

WiFi.hostByName(timeServer, timeServerIP);

I thought I’d be clever this time and use a higher-level NTP library that I found on GitHub, which seemed simple to use. Basically, you just call this routine and pass it your wireless router’s SSID/Password, and it does everything necessary to synchronize the ESP32’s software RTC. During the many hours spent developing/programming this project, I found that this NTP library routine didn’t always work, and ultimately it failed to work at all.

Examining the library code, I saw that it used a fixed IP# for the NTP server. From past experience, I knew that the IP#s of these servers change from time to time, and the method described in the last paragraph was more reliable. So, I reverted to using my own, “non-library” code, and it has worked well so far. I will say that you do have to wait a bit after sending an NTP request for the response to come back (if it’s going to), and you also must allow for a number of retries if you want to be sure of getting a valid NTP synchronization.
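
For readers who have not handled NTP by hand, the packet side of a “non-library” approach can be sketched as below. This is not the author's code: it is a minimal, host-testable illustration of the standard 48-byte NTP packet layout, with invented function names. Sending the request over UDP port 123, waiting for the reply and retrying are left to the surrounding Wi-Fi code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t kNtpPacketSize = 48;
// Seconds between the NTP epoch (1900) and the Unix epoch (1970).
constexpr uint32_t kNtpToUnixOffset = 2208988800UL;

// Fill a client request: LI = 0, version = 4, mode = 3 (client) gives 0x23.
inline void buildNtpRequest(uint8_t *packet) {
    assert(packet != nullptr);
    std::memset(packet, 0, kNtpPacketSize);
    packet[0] = 0x23;
}

// The transmit timestamp's whole seconds are big-endian in bytes 40..43.
inline uint32_t ntpSeconds(const uint8_t *reply) {
    return (static_cast<uint32_t>(reply[40]) << 24) |
           (static_cast<uint32_t>(reply[41]) << 16) |
           (static_cast<uint32_t>(reply[42]) << 8) |
           static_cast<uint32_t>(reply[43]);
}

// Basic sanity checks: full length, mode = 4 (server), non-zero timestamp.
inline bool replyLooksValid(const uint8_t *reply, std::size_t len) {
    return len >= kNtpPacketSize && (reply[0] & 0x07) == 4 &&
           ntpSeconds(reply) != 0;
}

// Convert to Unix time and apply the user's offset from UTC (in seconds).
inline int64_t localUnixTime(const uint8_t *reply, int32_t utcOffsetSeconds) {
    return static_cast<int64_t>(ntpSeconds(reply)) -
           static_cast<int64_t>(kNtpToUnixOffset) + utcOffsetSeconds;
}
```

A caller would typically send the request, poll for a reply for a second or so, and repeat a few times if replyLooksValid returns false, in line with the waiting and retrying described above.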

The user interface is quite simple. The first time that the ESP32 is powered up after the project code has been loaded, it checks the first two bytes of EEPROM for the “0x55, 0xAA” signature. Since the user hasn’t configured the project yet, these two EEPROM bytes will instead be in the default erased state. The program then calls the configuration routine, which asks for:

1) The desired Alarm time
2) The sound file # (from a list of sound file names on the display)
3) The local time offset from UTC. I don’t specifically handle Daylight Saving Time changeover in software, so you must change this offset twice a year on the day that the time changes.

These parameters are then saved to EEPROM, where they remain intact even if the unit is powered down by removing both AC power and the battery.
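The signature check and parameter storage can be sketched as below. Only the two signature bytes (0x55, 0xAA) at the start of EEPROM come from the description above; the Settings struct, the byte layout after the signature, and the function names are assumptions made for illustration. A plain byte array stands in for the EEPROM so the logic can be followed on its own; on the ESP32 the same reads and writes would go through the Arduino EEPROM library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical settings record; field names and sizes are assumptions.
struct Settings {
    uint8_t alarmHour;     // 0..23
    uint8_t alarmMinute;   // 0..59
    uint8_t soundFile;     // index into the list of sound files
    int16_t utcOffsetMin;  // local time offset from UTC, in minutes
};

constexpr uint8_t kSig0 = 0x55;
constexpr uint8_t kSig1 = 0xAA;
// 2 signature bytes + 5 bytes of settings in this assumed layout.
constexpr std::size_t kStoreSize = 7;

// True once the unit has been configured (signature present).
bool isConfigured(const uint8_t *store) {
    assert(store != nullptr);
    return store[0] == kSig0 && store[1] == kSig1;
}

// Write the settings, then the signature. Writing the signature last
// means an interrupted save is not mistaken for a valid configuration.
void saveSettings(uint8_t *store, const Settings &s) {
    assert(store != nullptr);
    const uint16_t raw = static_cast<uint16_t>(s.utcOffsetMin);
    store[2] = s.alarmHour;
    store[3] = s.alarmMinute;
    store[4] = s.soundFile;
    store[5] = static_cast<uint8_t>(raw >> 8);
    store[6] = static_cast<uint8_t>(raw & 0xFF);
    store[0] = kSig0;
    store[1] = kSig1;
}

// Read the settings back. Returns false if the signature is missing,
// in which case the caller should run the configuration routine.
bool loadSettings(const uint8_t *store, Settings &s) {
    if (!isConfigured(store)) return false;
    s.alarmHour = store[2];
    s.alarmMinute = store[3];
    s.soundFile = store[4];
    const uint16_t raw = static_cast<uint16_t>((store[5] << 8) | store[6]);
    s.utcOffsetMin = static_cast<int16_t>(raw);
    return true;
}
```

Storing the offset in minutes rather than hours is a design choice of this sketch; it leaves room for time zones that are not a whole number of hours from UTC.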

The user interface consists of the following:

1) Three push buttons:
   Menu: To select the configuration menus listed above
   Enter: To enter the value of the parameter being modified
   Play: To start/stop the playing of the relaxation sound. Once started, this will continue to play until the alarm time is reached, or the user hits Play again.
2) A rotary encoder to adjust parameter values
3) A potentiometer to adjust volume
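The Play button behavior described above amounts to a simple toggle with an automatic stop at the alarm time. A minimal sketch of that logic follows; the Player struct and both function names are hypothetical, and reading the button and the clock is left to the caller.

```cpp
#include <cstdint>

// Hypothetical playback state; not taken from the project's code.
struct Player {
    bool playing = false;
    uint8_t alarmHour = 7;
    uint8_t alarmMinute = 0;
};

// Each press of Play starts the sound if it is stopped, or stops it
// if it is playing.
void onPlayPressed(Player &p) {
    p.playing = !p.playing;
}

// Call regularly with the current local time. Playback stops when
// the alarm time is reached.
void checkAlarm(Player &p, uint8_t hour, uint8_t minute) {
    if (p.playing && hour == p.alarmHour && minute == p.alarmMinute) {
        p.playing = false;
    }
}
```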

An SD card containing at least one sound file in the .WAV format must be inserted into the SD card socket. When using SD cards in an MCU-based project, it is always good practice to format the SD card using the “SDFormatter” PC application.

I’ve now built three versions of this device over 10+ years, all of which worked well. Since I use it every night, it is one of my projects that, in addition to being interesting to build/program, I also use repeatedly. That’s my justification for spending the additional time designing the newer models. I haven’t mentioned that one of my original goals in building this newest version was to incorporate a GPS module. This module would:

1) Provide an accurate time (in place of the external, Web-based NTP server)
2) Provide a “local” NTP server that could be used by several other IoT devices I’ve built for my home, all of which have an ongoing need for the correct time/date. Currently they use the same “external” Web-based NTP server that this project does.

I was stymied by this part of the project. An older GPS module that I had in my “spare” parts bin turned out to be dead. I ordered a GPS board from Amazon that contained the common U-blox Neo-6M module and a tiny antenna with 3” of coax cable. While I was able to see a lot of NMEA messages spewing out of it, it only rarely would get the proper “fix” on enough satellites to provide the time, never mind my location. I gave up trying out the unit in a window with a good “view” of the sky and took the whole thing outdoors. Even then it worked horribly. So, I returned it for a refund, and decided to abandon that part of the project. I had planned on placing this project a few feet from a large window, and I doubt that the inexpensive GPS modules that I was looking at would have worked properly.

This failed experiment makes me suspicious when I see TV shows where someone hides a “GPS” tracker underneath a car. What kind of great antenna must those trackers use? 




Brian Millier runs Computer Interface Consultants. He was an instrumentation engineer in the Department of Chemistry at Dalhousie University (Halifax, NS, Canada) for 29 years.


Copyright © KCK Media Corp.
All Rights Reserved

Copyright © 2024 KCK Media Corp.
