Journal article

DeepBSL: 3-D Personalized Deep Binaural Sound Localization on Earable Devices

Abstract

The prevalence of earable devices, such as earbuds and headphones, allows people to converse or listen to audio recordings on the move, but often at the cost of reduced alertness to imminent health and safety threats in their surroundings. 3-D binaural sound localization (BSL), which aims to locate sound sources in space, plays a key role in improving one's situation awareness. BSL on earable devices is inherently challenging due to the limited number of microphones available as well as the subject- and location-dependent filtering effects of a person's pinna, head, and torso, described by head-related transfer functions (HRTFs). In this work, we develop DeepBSL, a deep neural network model that estimates the azimuth and elevation angles of sound sources relative to a person's head. For a new subject, an efficient procedure is developed to collect HRTFs at sparse locations using in-ear microphones and a mobile phone; these HRTFs are then used to synthesize sounds of any type at arbitrary locations to train personalized DeepBSL models. Extensive evaluations using synthetic data from a public data set and through real-world experiments demonstrate that the personalized DeepBSL models are data-efficient and can achieve better-than-human performance in BSL, while significantly outperforming a state-of-the-art model that can only predict the azimuth angles of sound sources. Our best-performing model has an average azimuth prediction error of $2.9^{\circ}$ ($4.1^{\circ}$) and an average elevation prediction error of $1.4^{\circ}$ ($2.9^{\circ}$) in an indoor (outdoor) environment.
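The synthesis step described above relies on a standard property of HRTFs: a sound rendered at a given location can be produced by convolving a mono source signal with the left- and right-ear head-related impulse responses (HRIRs, the time-domain counterparts of HRTFs) measured for that location. The paper does not publish its implementation; the sketch below is a minimal illustration of that convolution, with hypothetical toy HRIR taps rather than measured data.

```python
import numpy as np

def synthesize_binaural(mono, hrir_left, hrir_right):
    """Render a mono signal at the location associated with an HRIR pair
    by convolving it with the left- and right-ear impulse responses.
    Returns a 2 x (len(mono) + len(hrir) - 1) stereo array."""
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right])

# Toy example: a unit-impulse "source" passed through hypothetical
# 3-tap HRIRs (real HRIRs are typically hundreds of samples long).
src = np.array([1.0, 0.0, 0.0])
hl = np.array([0.9, 0.1, 0.0])   # hypothetical left-ear HRIR
hr = np.array([0.2, 0.7, 0.1])   # hypothetical right-ear HRIR
binaural = synthesize_binaural(src, hl, hr)
```

Because convolution with an impulse reproduces the filter itself, each output channel here simply echoes its HRIR; with a real recording and measured HRIRs, the same operation yields the location-dependent interaural time and level differences that a localization model learns from.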

Authors

El-Mohandes AM; Zandi NH; Zheng R

Journal

IEEE Internet of Things Journal, Vol. 10, No. 21, pp. 19004–19013

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Publication Date

January 1, 2023

DOI

10.1109/jiot.2023.3281128

ISSN

2327-4662
