Sixty years of frequency-domain monaural speech enhancement: From traditional to deep learning methods

C Zheng, H Zhang, W Liu, X Luo, A Li, X Li… - Trends in …, 2023 - journals.sagepub.com
Frequency-domain monaural speech enhancement has been extensively studied for over
60 years, and a great number of methods have been proposed and applied to many …

Fundamentals, present and future perspectives of speech enhancement

N Das, S Chakraborty, J Chaki, N Padhy… - International Journal of …, 2021 - Springer
Speech enhancement has substantial interest in the utilization of speaker identification,
video-conference, speech transmission through communication channels, speech-based …

Two heads are better than one: A two-stage complex spectral mapping approach for monaural speech enhancement

A Li, W Liu, C Zheng, C Fan, X Li - IEEE/ACM Transactions on …, 2021 - ieeexplore.ieee.org
For challenging acoustic scenarios as low signal-to-noise ratios, current speech
enhancement systems usually suffer from performance bottleneck in extracting the target …

Glance and gaze: A collaborative learning framework for single-channel speech enhancement

A Li, C Zheng, L Zhang, X Li - Applied Acoustics, 2022 - Elsevier
The capability of the human to pay attention to both coarse and fine-grained regions has
been applied to computer vision tasks. Motivated by that, we propose a collaborative …

DPT-FSNet: Dual-path transformer based full-band and sub-band fusion network for speech enhancement

F Dang, H Chen, P Zhang - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org
Sub-band models have achieved promising results due to their ability to model local
patterns in the spectrogram. Some studies further improve the performance by fusing sub …

Wavoice: A noise-resistant multi-modal speech recognition system fusing mmWave and audio signals

T Liu, M Gao, F Lin, C Wang, Z Ba, J Han… - Proceedings of the 19th …, 2021 - dl.acm.org
With the advance in automatic speech recognition, voice user interface has gained
popularity recently. Since the COVID-19 pandemic, VUI is increasingly preferred in online …

On loss functions for supervised monaural time-domain speech enhancement

M Kolbæk, ZH Tan, SH Jensen… - IEEE/ACM Transactions …, 2020 - ieeexplore.ieee.org
Many deep learning-based speech enhancement algorithms are designed to minimize the
mean-square error (MSE) in some transform domain between a predicted and a target …

On the compensation between magnitude and phase in speech separation

ZQ Wang, G Wichern, J Le Roux - IEEE Signal Processing …, 2021 - ieeexplore.ieee.org
Deep neural network (DNN) based end-to-end optimization in the complex time-frequency
(TF) domain or time domain has shown considerable potential in monaural speech …

Divide and conquer: A deep CASA approach to talker-independent monaural speaker separation

Y Liu, DL Wang - IEEE/ACM Transactions on audio, speech …, 2019 - ieeexplore.ieee.org
We address talker-independent monaural speaker separation from the perspectives of deep
learning and computational auditory scene analysis (CASA). Specifically, we decompose …

Attention Wave-U-Net for speech enhancement

R Giri, U Isik, A Krishnaswamy - 2019 IEEE Workshop on …, 2019 - ieeexplore.ieee.org
We propose a novel application of an attention mechanism in neural speech enhancement,
by presenting a U-Net architecture with attention mechanism, which processes the raw …