Digital Echolocation for the Visually Impaired
An assistive device that helps a visually impaired user sense and navigate their surroundings using directional audio.
Inspired by bats, whales, and dolphins, we have designed a device that enables users to safely navigate their surroundings with sound. AroundSound is a walking stick attachment that, when activated using a button, maps the distance and direction of objects in front of the user to “spatial sounds” — which, to the user, seem to be coming from the direction of the obstacle. The distance to the obstacle is mapped to pitch; the higher the frequency, the closer the object. The device helps the user mentally picture their environment and determine the safest direction to move forward (Figure 1).
We designed AroundSound as our final project for the “ECE 4760: Designing with Microcontrollers” course at Cornell University. We built the project around a PIC32 microcontroller on a development board, with peripherals including an HC-SR04 ultrasonic sensor for distance measurement, an SG-90 mini servo mounted on a pan-tilt gimbal, and an audio jack with headphones for sound output (Figure 2). The PIC32MX2 is a 32-bit microcontroller based on the MIPS instruction set with a rich set of on-chip peripherals. In particular, the MCU provides four SPI/I2S channels that are useful for audio processing and playback, along with input capture and output compare modules that support a wide range of sensing and motor-control functions. Since this is the platform used in the course, we also received a development board that breaks out the MCU’s pins through a port expander and provides a 12-bit, two-channel DAC used to drive either audio output or a TFT graphical display. The MCU is programmed using the MPLAB X IDE, which simplifies source-code compilation and provides tools to optimize the assembly program and improve system performance.
The hardware and software design decisions and the general operating principles of the three main subsystems of our prototype – servo movement, distance measurement, and sound production – are described in the subsequent sections of this article.
FORM FACTOR CONSIDERATIONS
For use as a personal navigation aid, the device needed to meet a few requirements. First, it had to be something the user could keep physically close at all times, so that the device could almost “see” from the user’s perspective. The sensor also had to sit low enough to detect most of the obstacles likely to come in the user’s way.
With these considerations, we designed AroundSound as an attachment that fits on top of a walking stick or staff, already a common assistive product for visually impaired people. We placed the activation button on the stick so that it gives the user tactile feedback about the stick’s orientation – the user holds the stick with the button facing forward. A servo mounted on top of the stick rotates the ultrasonic distance sensor over a series of angles for data acquisition. A pan-tilt gimbal on the servo pans the sensor from -60° to 60°, with 0° being the direction of the button (Figure 3).
We wired a button with a pull-up circuit to activate the system. The button faces directly away from the user, making the correct orientation of the cane easy to feel. This point also defines our system’s 0° position, which, together with the servo’s set position of -60° at the beginning of each scan, ensures that the user can repeat readings of a static environment and get consistent results.
We defined a thread that executes every 30 ms to check for a button press; without it, the sound production and servo movement threads cannot begin execution. The system proceeds to the thread that activates the ultrasonic sensor and waits for 1 second to allow the sensor to measure and compute a distance, but it cannot move forward until the button-press condition is satisfied. Once the button has been pressed, the recorded distance is mapped to a frequency, and the amplitudes of the left and right audio outputs are adjusted based on the servo angle to give the sound a sense of direction. The servo angle then increments by twenty degrees, yielding to the ultrasonic thread for the next distance measurement. The process repeats until seven measurements have been recorded and output to the user, after which the servo returns to its initial position. The 1-second delay used for sensor measurement and calculation also satisfies the servo’s timing constraint, since at least 50 milliseconds are expected between two consecutive PWM control updates to the servo.
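The sweep described above can be sketched as a simple loop. Here `measure_distance_cm()`, `play_spatial_beep()`, and `set_servo_deg()` are hypothetical stubs standing in for the ultrasonic, audio, and servo threads of the actual firmware, which run cooperatively rather than in a single function:

```c
#include <stdint.h>

#define SCAN_STEPS     7
#define SCAN_START_DEG (-60)
#define SCAN_STEP_DEG  20

/* Return the servo angle for scan step i (0..6): -60, -40, ..., +60. */
static int32_t scan_angle_deg(int i)
{
    return SCAN_START_DEG + SCAN_STEP_DEG * i;
}

/* Hypothetical stubs standing in for the ultrasonic and audio threads;
 * the real firmware reads the HC-SR04 and drives the DAC here. */
static uint32_t measure_distance_cm(void) { return 100; }
static void play_spatial_beep(uint32_t dist_cm, int32_t angle_deg)
{ (void)dist_cm; (void)angle_deg; }
static void set_servo_deg(int32_t angle_deg) { (void)angle_deg; }

/* One button-triggered sweep: seven measurements, one beep each,
 * then return the servo to its initial position. */
static void scan_once(void)
{
    for (int i = 0; i < SCAN_STEPS; i++) {
        set_servo_deg(scan_angle_deg(i));        /* move, then settle */
        uint32_t d = measure_distance_cm();      /* ~1 s per reading  */
        play_spatial_beep(d, scan_angle_deg(i)); /* pitch + panning   */
    }
    set_servo_deg(SCAN_START_DEG);               /* back to -60 deg   */
}
```
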
The flow of events over different subsystems is shown in Figure 4.
SERVO MOVEMENT

We employ an SG-90 servo that operates at 3.3V, the same operating voltage as the microcontroller. To ensure control over the start and end of sensor data acquisition and servo movement, we added a button that initializes the sequence, starting the servo and sensor sweep from a fixed initial position.
We defined a range of -60° to 60° for actuating the sensor, as shown in Figure 5. Combined with the selected sensor’s 15° field of view, this gives a total horizontal field of view of 135°, similar to the angular range of the average human’s direct and peripheral visual field. The servo is driven by a 50 Hz pulse-width modulated (PWM) signal, with a 5% duty cycle (1 ms pulse) indicating the -90° position and a 10% duty cycle (2 ms pulse) indicating the +90° position. To get the angular increments we wanted, we calculated the duty cycles required for the start and end positions and scaled the pulse duration so that each step corresponded to an increment of 20°. We use an output compare module on the PIC32 to sequentially generate square pulses of the desired length to command these positions.
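The duty-cycle arithmetic works out to a simple linear map from angle to pulse width. A sketch in integer microseconds, as one might compute before loading the PIC32 output compare register (register writes omitted):

```c
#include <stdint.h>

/* Map a servo angle in degrees (-90..+90) to a PWM pulse width in
 * microseconds, assuming a 50 Hz frame where a 1 ms pulse means
 * -90 deg and a 2 ms pulse means +90 deg, as described above. */
static uint32_t servo_pulse_us(int32_t angle_deg)
{
    /* 1500 us is the 0-deg center; the 180-deg span covers 1000 us. */
    return (uint32_t)(1500 + (angle_deg * 1000) / 180);
}
```

With this map, each 20° scan step changes the pulse width by roughly 111 µs.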
DISTANCE MEASUREMENT

The second key subsystem measures the distance to objects around the user. To map the user’s environment, the device needed to measure the distance to obstacles in the angular range in front of the user, up to a few meters away. We considered a few different sensors for this function, including a time-of-flight (TOF) laser range sensor, a triangulation-based analog IR sensor, and the commonly available HC-SR04 ultrasonic sensor. The TOF sensor was the most sophisticated of the three, featuring high-resolution measurement with +/- 1.5 cm accuracy and communication over UART or CAN. While we were able to communicate with the sensor, we couldn’t retrieve any usable distance data from it. The analog IR sensor from Sharp was much easier to configure and read through the microcontroller’s analog-to-digital converter (ADC). Although the datasheet shows the sensor’s output voltage varying linearly with inverse distance, repeated testing revealed a non-linear error between the measured and actual distances in our desired range. Finally, given limited time and resources, we went with the HC-SR04 ultrasonic sensor for its acceptable accuracy (1-2 cm) and reliability in measurement.
The HC-SR04 measures the distance to an object by transmitting an ultrasonic pulse and measuring the time it takes for the echo to return. It measures distances from 2 cm to 4 m with a rated accuracy of 3 mm. To operate the sensor, we used an IO pin to send a 10 µs pulse to the sensor’s TRIG pin, triggering it to emit a burst of ultrasonic pulses. The sensor then pulls its ECHO output high, and brings it back low once the reflection of the burst has been received. The duration for which ECHO stays high encodes the distance, and we used an input capture module on the microcontroller to measure this duration and derive the distance reading.
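The conversion that the input-capture reading feeds into is one multiply and one divide. Assuming sound travels at roughly 343 m/s (0.0343 cm/µs) and that the echo time covers the round trip, a sketch in integer math:

```c
#include <stdint.h>

/* Convert the HC-SR04 ECHO high time (microseconds) into a distance
 * in centimeters: distance = t * 0.0343 / 2, i.e. roughly t / 58.
 * Integer scaling keeps the math cheap on the microcontroller. */
static uint32_t echo_us_to_cm(uint32_t echo_us)
{
    return (echo_us * 343) / 20000;
}
```

For example, an object 1 m away produces an ECHO pulse of about 5.8 ms.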
SPATIAL SOUND PRODUCTION

AroundSound makes use of the user’s inherent ability to localize sound – to identify the direction and distance of a detected sound’s origin – by producing spatial audio. Spatial audio technologies work over headphones to let a user perceive sounds as coming from specific directions in three-dimensional space. Sophisticated systems like Apple’s Spatial Audio also use head-tracking to adjust the virtual origin of the sound as the user moves their head. In our system, we use two-dimensional spatial sounds originating from potential obstacles in the 2D plane of the device. Each time the user presses the button, seven distinct musical notes are produced, each signifying the distance to any obstacle in the direction the sound appears to come from. Over the seven beeps, the virtual origin of the sound sweeps across the field of view (-60° to 60°) from left to right.
These spatial beeps are produced based on the duplex theory of sound localization by the human auditory system. This theory states that, since human ears are on different sides of the head and thus have different coordinates in space, the distances between the sound source and ears are different. This results in time and intensity differences between the sound signals that reach the two ears called Interaural Time Difference (ITD) and Interaural Intensity Difference (IID), respectively.
ITD is a direct consequence of the difference in the distances the sound wave travels to reach each ear (Figure 6). Sounds produced on the right side of the user’s head reach the right ear earlier than the left. The calculation of ITD therefore factors in the angular position of the acoustic source, the head radius of the user, and the speed of sound. IID is highly frequency-dependent and is a consequence of the shadowing effect of the human head and ears (Figure 6). Sounds produced on the right side of the user’s head are louder in the right ear than the left, because the head and right ear act as barriers between the sound source and the left ear, absorbing some of the sound waves. As described in Figure 6, the ITD and IID proportionality constants relating the time delta and the amplitudes received at the two ears depend on the lateral angle of the sound source, the head radius (for ITD), and the sound frequency (for IID), which we determined experimentally for our use case. Based on these calculations, the sound wave for the farther ear is generated an ITD amount of time after that for the closer ear, and is scaled down by the IID attenuation before being played.
Spatial audio implemented this way lets the user determine the direction of the sound’s origin in the two-dimensional plane. The second aspect of audio in AroundSound is proximity perception of objects in those directions. This is done by splitting the full range of distances into smaller sub-ranges and mapping each sub-range to a distinct frequency. For a pleasant listening experience, these sub-ranges are mapped to musical notes spanning multiple octaves. In our implementation, the longest distance (2 meters) maps to the note C4 and the shortest (2 cm) to B6, giving 36 distinct notes, each assigned to a different sub-range. The resulting sub-ranges are 5.5 cm long – fine-grained enough for accuracy, yet coarse enough not to confuse the user. Higher frequencies (red region in the figure) intuitively warn the user that an object is very close, while lower frequencies (green region) indicate safety, as the object in that direction is farther away (Figure 7).
To produce the sounds, the PIC32 communicates with a 12-bit, dual-channel digital-to-analog converter (DAC). The basic beeps are implemented using the direct digital synthesis (DDS) algorithm to synthesize pure tones of a desired frequency. DDS turns the CPU-intensive sine computation into a constant-time sine-table lookup indexed by an overflowing phase accumulator, allowing the PIC32 to produce high-quality sound at a high sampling rate (20 kHz). The DAC then converts the digital waveform into an analog waveform that can be heard through the speakers. To improve the sound quality – its accuracy and intelligibility – and make the beeps more pleasant, we used frequency modulation (FM) synthesis to make them sound like piano notes. FM synthesis generates complex waveforms by using one sinusoid (the operator) to modulate the frequency of another (the fundamental). The method was first developed at Stanford and later popularized by Yamaha through its DX7 synthesizers.
In our system, we use a 1:3 ratio between the fundamental and operator frequencies, both generated using DDS. To give the sounds true piano-note character, we also applied exponential envelopes to both sinusoids, with their respective attack and decay phases, as shown in Figure 8. The amplitude envelope provides a natural attack, sustain, and decay resembling the press of a piano key. All of these calculations were optimized using linear approximations to maintain the sampling rate. At each timer interrupt, occurring at the sampling rate, the digital input to the DAC is calculated with DDS on a pure note chosen by the object distance, then modulated using FM synthesis and these envelopes.
As a proof of concept, the AroundSound device accomplished our major goal of making an intuitive system that maps the surrounding physical environment to auditory notes for obstacle detection. From our tests and demonstrations, we found that AroundSound’s performance was extremely reliable. We set up multiple test environments consisting of movable segments of cardboard in addition to the static obstacles (walls, cabinets, etc.) present in a lab environment, and successfully navigated these environments while blindfolded. Nonetheless, there are some points of improvement that we would like to bring into consideration.
First, we noticed that ultrasound is sensitive to the reflecting surface. For soft, contorted surfaces (such as people and clothing), the sensor provides no useful information, which could prove harmful to a visually impaired person relying on the device. A LIDAR or time-of-flight sensor (or a depth camera) would be far more effective at detecting a wider range of obstacles. Second, we could consider alternate form factors that house the audio, sensing, and actuation circuitry in a single unit to improve the portability and visual appeal of the device. Third, we need an audio device that does not block the user’s hearing, which is especially important for a visually impaired user – bone-conduction headphones could let ambient noise through while transmitting AroundSound’s 3D audio, ensuring the user can still pick up auditory cues from their surroundings.
In the current setup, AroundSound uses both speakers and headphones. Powered speakers, with a built-in amplifier, are plugged into the audio jack connected to the MCP4822 DAC on the PIC32 board. They convert the audio signal (electrical energy) into sound waves (mechanical energy) the user can hear, and their built-in amplifier provides gain to the DAC’s output. On its own, the DAC’s output signal is not strong enough to be comfortably heard, with a maximum output current of 25 milliamperes due to its high output impedance. The headphones we use have no built-in amplifier, so we plug them into the speakers to hear the amplified signal. The speakers, however, make the device immobile – not ideal for a walking-stick attachment. We therefore plan to build an external amplifier circuit to deliver a strong audio signal directly: a single-supply, dual op-amp would boost the output current from the DAC so that headphones can plug straight into the audio jack.
Finally, it is important to test the device with people with varying degrees of visual impairment in dynamic environments to determine the usefulness and reliability of AroundSound. That will help us evaluate the project in more realistic scenarios and improve the device to better suit the needs of its target users.
PUBLISHED IN CIRCUIT CELLAR MAGAZINE • SEPTEMBER 2022 #386
Amulya Khurana is a Master of Engineering student in Electrical and Computer Engineering at Cornell University, having completed her undergraduate degree in the same field at Cornell. She is passionate about computer architecture and chip design. She can be reached at email@example.com.
Krithik Ranjan is an undergraduate senior studying Electrical and Computer Engineering at Cornell University, and will be pursuing a PhD in Creative Technology and Design at the University of Colorado, Boulder. He is passionate about developing human-centered devices and human-computer interaction. He can be reached at firstname.lastname@example.org.
Aparajito Saha is an undergraduate senior studying Electrical and Computer Engineering and Computer Science at Cornell University, and will be working at Amazon Robotics as a software engineer after graduation. He is passionate about the applications of robotics and embedded devices. He can be reached at email@example.com.