Sixty years of frequency-domain monaural speech enhancement: From traditional to deep learning methods

C Zheng, H Zhang, W Liu, X Luo, A Li, X Li… - Trends in …, 2023 - journals.sagepub.com
Frequency-domain monaural speech enhancement has been extensively studied for over
60 years, and a great number of methods have been proposed and applied to many …

Deep learning for audio signal processing

H Purwins, B Li, T Virtanen, J Schlüter… - IEEE Journal of …, 2019 - ieeexplore.ieee.org
Given the recent surge in developments of deep learning, this paper provides a review of the
state-of-the-art deep learning techniques for audio signal processing. Speech, music, and …

TCNN: Temporal convolutional neural network for real-time speech enhancement in the time domain

A Pandey, DL Wang - ICASSP 2019-2019 IEEE International …, 2019 - ieeexplore.ieee.org
This work proposes a fully convolutional neural network (CNN) for real-time speech
enhancement in the time domain. The proposed CNN is an encoder-decoder based …

Learning complex spectral mapping with gated convolutional recurrent networks for monaural speech enhancement

K Tan, DL Wang - IEEE/ACM Transactions on Audio, Speech …, 2019 - ieeexplore.ieee.org
Phase is important for perceptual quality of speech. However, it seems intractable to directly
estimate phase spectra through supervised learning due to their lack of spectrotemporal …

A new framework for CNN-based speech enhancement in the time domain

A Pandey, DL Wang - IEEE/ACM Transactions on Audio …, 2019 - ieeexplore.ieee.org
This paper proposes a new learning mechanism for a fully convolutional neural network
(CNN) to address speech enhancement in the time domain. The CNN takes as input the time …

On loss functions for supervised monaural time-domain speech enhancement

M Kolbæk, ZH Tan, SH Jensen… - IEEE/ACM Transactions …, 2020 - ieeexplore.ieee.org
Many deep learning-based speech enhancement algorithms are designed to minimize the
mean-square error (MSE) in some transform domain between a predicted and a target …

Dense CNN with self-attention for time-domain speech enhancement

A Pandey, DL Wang - IEEE/ACM transactions on audio, speech …, 2021 - ieeexplore.ieee.org
Speech enhancement in the time domain is becoming increasingly popular in recent years,
due to its capability to jointly enhance both the magnitude and the phase of speech. In this …

CochleaNet: A robust language-independent audio-visual model for real-time speech enhancement

M Gogate, K Dashtipour, A Adeel, A Hussain - Information Fusion, 2020 - Elsevier
Noisy situations cause huge problems for the hearing-impaired, as hearing aids often make
speech more audible but do not always restore intelligibility. In noisy settings, humans …

Differentiable consistency constraints for improved deep speech enhancement

S Wisdom, JR Hershey, K Wilson… - ICASSP 2019-2019 …, 2019 - ieeexplore.ieee.org
In recent years, deep networks have led to dramatic improvements in speech enhancement
by framing it as a data-driven pattern recognition problem. In many modern enhancement …

Densely connected neural network with dilated convolutions for real-time speech enhancement in the time domain

A Pandey, DL Wang - ICASSP 2020-2020 IEEE International …, 2020 - ieeexplore.ieee.org
In this work, we propose a fully convolutional neural network for real-time speech
enhancement in the time domain. The proposed network is an encoder-decoder based …