Sixty years of frequency-domain monaural speech enhancement: From traditional to deep learning methods
Frequency-domain monaural speech enhancement has been extensively studied for over
60 years, and a great number of methods have been proposed and applied to many …
Fundamentals, present and future perspectives of speech enhancement
Speech enhancement has substantial interest in the utilization of speaker identification,
video-conference, speech transmission through communication channels, speech-based …
Two heads are better than one: A two-stage complex spectral mapping approach for monaural speech enhancement
For challenging acoustic scenarios as low signal-to-noise ratios, current speech
enhancement systems usually suffer from performance bottleneck in extracting the target …
Glance and gaze: A collaborative learning framework for single-channel speech enhancement
The capability of the human to pay attention to both coarse and fine-grained regions has
been applied to computer vision tasks. Motivated by that, we propose a collaborative …
DPT-FSNet: Dual-path transformer based full-band and sub-band fusion network for speech enhancement
Sub-band models have achieved promising results due to their ability to model local
patterns in the spectrogram. Some studies further improve the performance by fusing sub …
Wavoice: A noise-resistant multi-modal speech recognition system fusing mmWave and audio signals
With the advance in automatic speech recognition, voice user interface has gained
popularity recently. Since the COVID-19 pandemic, VUI is increasingly preferred in online …
On loss functions for supervised monaural time-domain speech enhancement
Many deep learning-based speech enhancement algorithms are designed to minimize the
mean-square error (MSE) in some transform domain between a predicted and a target …
On the compensation between magnitude and phase in speech separation
Deep neural network (DNN) based end-to-end optimization in the complex time-frequency
(TF) domain or time domain has shown considerable potential in monaural speech …
Divide and conquer: A deep CASA approach to talker-independent monaural speaker separation
We address talker-independent monaural speaker separation from the perspectives of deep
learning and computational auditory scene analysis (CASA). Specifically, we decompose …
Attention Wave-U-Net for speech enhancement
We propose a novel application of an attention mechanism in neural speech enhancement,
by presenting a U-Net architecture with attention mechanism, which processes the raw …