A survey of sound source localization with deep learning methods

PA Grumiaux, S Kitić, L Girin, A Guérin - The Journal of the Acoustical …, 2022 - pubs.aip.org
This article is a survey of deep learning methods for single and multiple sound source
localization, with a focus on sound source localization in indoor environments, where …

Machine learning in acoustics: Theory and applications

MJ Bianco, P Gerstoft, J Traer, E Ozanich… - The Journal of the …, 2019 - pubs.aip.org
Acoustic data provide scientific and engineering insights in fields ranging from biology and
communications to ocean and Earth science. We survey the recent advances and …

FullSubNet: A full-band and sub-band fusion model for real-time single-channel speech enhancement

X Hao, X Su, R Horaud, X Li - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
This paper proposes a full-band and sub-band fusion model, named as FullSubNet, for
single-channel real-time speech enhancement. Full-band and sub-band refer to the models …

Multi-speaker DOA estimation using deep convolutional networks trained with noise signals

S Chakrabarty, EAP Habets - IEEE Journal of Selected Topics …, 2019 - ieeexplore.ieee.org
Supervised learning-based methods for source localization, being data driven, can be
adapted to different acoustic conditions via training and have been shown to be robust to …

A consolidated perspective on multimicrophone speech enhancement and source separation

S Gannot, E Vincent… - … /ACM Transactions on …, 2017 - ieeexplore.ieee.org
Speech enhancement and separation are core problems in audio signal processing, with
commercial applications in devices as diverse as mobile phones, conference call systems …

Broadband DOA estimation using convolutional neural networks trained with noise signals

S Chakrabarty, EAP Habets - … of Signal Processing to Audio and …, 2017 - ieeexplore.ieee.org
A convolutional neural network (CNN) based classification method for broadband DOA
estimation is proposed, where the phase component of the short-time Fourier transform …

Speech processing for digital home assistants: Combining signal processing with deep-learning techniques

R Haeb-Umbach, S Watanabe… - IEEE Signal …, 2019 - ieeexplore.ieee.org
Once a popular theme of futuristic science fiction or far-fetched technology forecasts, digital
home assistants with a spoken language interface have become a ubiquitous commodity …

Combining spectral and spatial features for deep learning based blind speaker separation

ZQ Wang, DL Wang - … ACM Transactions on audio, speech, and …, 2018 - ieeexplore.ieee.org
This study tightly integrates complementary spectral and spatial features for deep learning
based multi-channel speaker separation in reverberant environments. The key idea is to …

Real acoustic fields: An audio-visual room acoustics dataset and benchmark

Z Chen, ID Gebru, C Richardt… - Proceedings of the …, 2024 - openaccess.thecvf.com
We present a new dataset called Real Acoustic Fields (RAF) that captures real acoustic
room data from multiple modalities. The dataset includes high-quality and densely captured …

Robust speaker localization guided by deep learning-based time-frequency masking

ZQ Wang, X Zhang, DL Wang - IEEE/ACM Transactions on …, 2018 - ieeexplore.ieee.org
Deep learning-based time-frequency (TF) masking has dramatically advanced monaural
(single-channel) speech separation and enhancement. This study investigates its potential …