A survey of sound source localization with deep learning methods
This article is a survey of deep learning methods for single and multiple sound source
localization, with a focus on sound source localization in indoor environments, where …
Machine learning in acoustics: Theory and applications
Acoustic data provide scientific and engineering insights in fields ranging from biology and
communications to ocean and Earth science. We survey the recent advances and …
FullSubNet: A full-band and sub-band fusion model for real-time single-channel speech enhancement
This paper proposes a full-band and sub-band fusion model, named as FullSubNet, for
single-channel real-time speech enhancement. Full-band and sub-band refer to the models …
Multi-speaker DOA estimation using deep convolutional networks trained with noise signals
Supervised learning-based methods for source localization, being data driven, can be
adapted to different acoustic conditions via training and have been shown to be robust to …
A consolidated perspective on multimicrophone speech enhancement and source separation
Speech enhancement and separation are core problems in audio signal processing, with
commercial applications in devices as diverse as mobile phones, conference call systems …
Broadband DOA estimation using convolutional neural networks trained with noise signals
A convolution neural network (CNN) based classification method for broadband DOA
estimation is proposed, where the phase component of the short-time Fourier transform …
Speech processing for digital home assistants: Combining signal processing with deep-learning techniques
Once a popular theme of futuristic science fiction or far-fetched technology forecasts, digital
home assistants with a spoken language interface have become a ubiquitous commodity …
Combining spectral and spatial features for deep learning based blind speaker separation
This study tightly integrates complementary spectral and spatial features for deep learning
based multi-channel speaker separation in reverberant environments. The key idea is to …
Real acoustic fields: An audio-visual room acoustics dataset and benchmark
We present a new dataset called Real Acoustic Fields (RAF) that captures real acoustic
room data from multiple modalities. The dataset includes high-quality and densely captured …
Robust speaker localization guided by deep learning-based time-frequency masking
Deep learning-based time-frequency (TF) masking has dramatically advanced monaural
(single-channel) speech separation and enhancement. This study investigates its potential …