An overview of deep-learning-based audio-visual speech enhancement and separation
Speech enhancement and speech separation are two related tasks, whose purpose is to
extract either one or more target speech signals, respectively, from a mixture of sounds …
extract either one or more target speech signals, respectively, from a mixture of sounds …
Supervised speech separation based on deep learning: An overview
Speech separation is the task of separating target speech from background interference.
Traditionally, speech separation is studied as a signal processing problem. A more recent …
Traditionally, speech separation is studied as a signal processing problem. A more recent …
Real time speech enhancement in the waveform domain
A Defossez, G Synnaeve, Y Adi - ar** with gated convolutional recurrent networks for monaural speech enhancement
Phase is important for perceptual quality of speech. However, it seems intractable to directly
estimate phase spectra through supervised learning due to their lack of spectrotemporal …
estimate phase spectra through supervised learning due to their lack of spectrotemporal …
Wham!: Extending speech separation to noisy environments
G Wichern, J Antognini, M Flynn, LR Zhu… - ar** speakers using
a single audio channel has brought us closer to solving the cocktail party problem. However …
a single audio channel has brought us closer to solving the cocktail party problem. However …
Light gated recurrent units for speech recognition
A field that has directly benefited from the recent advances in deep learning is automatic
speech recognition (ASR). Despite the great achievements of the past decades, however, a …
speech recognition (ASR). Despite the great achievements of the past decades, however, a …
Deep learning for environmentally robust speech recognition: An overview of recent developments
Eliminating the negative effect of non-stationary environmental noise is a long-standing
research topic for automatic speech recognition but still remains an important challenge …
research topic for automatic speech recognition but still remains an important challenge …
Phase-sensitive and recognition-boosted speech separation using deep recurrent neural networks
Separation of speech embedded in non-stationary interference is a challenging problem that
has recently seen dramatic improvements using deep network-based methods. Previous …
has recently seen dramatic improvements using deep network-based methods. Previous …
[PDF][PDF] Investigating RNN-based speech enhancement methods for noise-robust Text-to-Speech.
The quality of text-to-speech (TTS) voices built from noisy speech is compromised.
Enhancing the speech data before training has been shown to improve quality but voices …
Enhancing the speech data before training has been shown to improve quality but voices …