Goal-driven, neurobiological-inspired convolutional neural network models of human spatial hearing

Kiki van der Heijden*, Siamak Mehrkanoon

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

The human brain effortlessly solves the complex computational task of sound localization using a mixture of spatial cues. How the brain performs this task in naturalistic listening environments (e.g. with reverberation) is not well understood. In the present paper, we build on the success of deep neural networks at solving complex and high-dimensional problems [1] to develop goal-driven, neurobiological-inspired convolutional neural network (CNN) models of human spatial hearing. After training, we visualize and quantify feature representations in intermediate layers to gain insights into the representational mechanisms underlying sound location encoding in CNNs. Our results show that neurobiological-inspired CNN models trained on real-life sounds spatialized with human binaural hearing characteristics can accurately predict sound location in the horizontal plane. CNN localization acuity across the azimuth resembles human sound localization acuity, but CNN models outperform human sound localization in the back. Training models with different objective functions (that is, minimizing either Euclidean or angular distance) modulates localization acuity in particular ways. Moreover, different implementations of binaural integration result in unique patterns of localization errors that resemble behavioral observations in humans. Finally, feature representations reveal a gradient of spatial selectivity across network layers, starting with broad spatial representations in early layers and progressing to sparse, highly selective spatial representations in deeper layers. In sum, our results show that neurobiological-inspired CNNs are a valid approach to modeling human spatial hearing. This work paves the way for future studies combining neural network models with empirical measurements of neural activity to unravel the complex computational mechanisms underlying neural sound location encoding in the human auditory pathway.
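The abstract contrasts two training objectives for azimuth prediction: minimizing Euclidean distance versus angular distance. The paper's exact formulations are not given here, so the following is only an illustrative sketch of how the two objectives can differ, assuming azimuth targets in degrees and a unit-circle encoding for the Euclidean variant (both the encoding and the function names are assumptions, not the authors' implementation):

```python
import numpy as np

def euclidean_loss(pred_deg, target_deg):
    """Squared Euclidean distance between predicted and target azimuths,
    each encoded as a 2-D unit vector on the circle (assumed encoding)."""
    pred = np.stack([np.cos(np.radians(pred_deg)),
                     np.sin(np.radians(pred_deg))], axis=-1)
    targ = np.stack([np.cos(np.radians(target_deg)),
                     np.sin(np.radians(target_deg))], axis=-1)
    return np.sum((pred - targ) ** 2, axis=-1)

def angular_loss(pred_deg, target_deg):
    """Absolute angular distance in degrees, wrapped to [0, 180]."""
    diff = np.abs(pred_deg - target_deg) % 360.0
    return np.minimum(diff, 360.0 - diff)

# The two objectives penalize the same error differently: angular distance
# grows linearly with the angle, while the chordal (Euclidean) distance
# saturates as errors approach 180 degrees.
print(angular_loss(350.0, 10.0))    # wraps around the rear: 20 degrees
print(euclidean_loss(350.0, 10.0))
```

Note that the angular loss handles the 360°/0° wraparound explicitly, whereas a naive Euclidean loss on raw degree values would not; this kind of difference is one plausible route by which the choice of objective could modulate localization acuity at particular azimuths.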

Original language: English
Pages (from-to): 432-442
Number of pages: 11
Journal: Neurocomputing
Volume: 470
DOIs
Publication status: Published - 22 Jan 2022

Bibliographical note

Funding Information:
Kiki van der Heijden received a B.A. in Cultural Sciences from Maastricht University (The Netherlands) in 2006, a M.A. in Media and Communications Management from Middlesex University (London, United Kingdom) in 2007, and a M.Sc. in Cognitive Neuroscience from Maastricht University (The Netherlands) in 2012. She conducted her Ph.D. research at Maastricht University and Georgetown University (United States) and was awarded the Ph.D. degree in 2017. After completing her Ph.D., she worked as a Post-Doctoral Research Fellow at the Cognitive Neuroscience Department at Maastricht University, and the Ear-, Nose and Throat (ENT) Department of the Maastricht University Medical Center. She is currently a Research Fellow at the Donders Institute at Radboud University (Nijmegen, Netherlands) and a Visiting Research Fellow at Columbia University (New York, United States). In her research, she uses an interdisciplinary approach combining cognitive neuroscience, computational modelling (focusing on deep neural network models) and clinical audiology to unravel the computational mechanisms underlying neural sound encoding in normal and hearing-impaired listeners. She has been awarded several research grants, including a Marie-Curie Individual Global Fellowship by the European Commission in 2019, and a NWO Veni by the Dutch Scientific Council (NWO) in 2020.

Funding Information:
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 898134.

Publisher Copyright:
© 2021 The Author(s)

Keywords

  • Binaural integration
  • Convolutional neural network
  • Deep learning
  • Human sound localization
