A review of deep learning techniques for speech processing
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …
learning. The use of multiple processing layers has enabled the creation of models capable …
Deep learning for environmentally robust speech recognition: An overview of recent developments
Eliminating the negative effect of non-stationary environmental noise is a long-standing
research topic for automatic speech recognition but still remains an important challenge …
research topic for automatic speech recognition but still remains an important challenge …
SpeechBrain: A general-purpose speech toolkit
SpeechBrain is an open-source and all-in-one speech toolkit. It is designed to facilitate the
research and development of neural speech processing technologies by being simple …
research and development of neural speech processing technologies by being simple …
A survey on multi-task learning
Multi-Task Learning (MTL) is a learning paradigm in machine learning and its aim is to
leverage useful information contained in multiple related tasks to help improve the …
leverage useful information contained in multiple related tasks to help improve the …
Light gated recurrent units for speech recognition
A field that has directly benefited from the recent advances in deep learning is automatic
speech recognition (ASR). Despite the great achievements of the past decades, however, a …
speech recognition (ASR). Despite the great achievements of the past decades, however, a …
Deep attractor network for single-microphone speaker separation
Despite the overwhelming success of deep learning in various speech processing tasks, the
problem of separating simultaneous speakers in a mixture remains challenging. Two major …
problem of separating simultaneous speakers in a mixture remains challenging. Two major …
Cold diffusion for speech enhancement
Diffusion models have recently shown promising results for difficult enhancement tasks such
as the conditional and unconditional restoration of natural images and audio signals. In this …
as the conditional and unconditional restoration of natural images and audio signals. In this …
End-to-end waveform utterance enhancement for direct evaluation metrics optimization by fully convolutional neural networks
Speech enhancement model is used to map a noisy speech to a clean speech. In the
training stage, an objective function is often adopted to optimize the model parameters …
training stage, an objective function is often adopted to optimize the model parameters …
Speaker-independent speech separation with deep attractor network
Despite the recent success of deep learning for many speech processing tasks, single-
microphone, speaker-independent speech separation remains challenging for two main …
microphone, speaker-independent speech separation remains challenging for two main …
Multichannel signal processing with deep neural networks for automatic speech recognition
Multichannel automatic speech recognition (ASR) systems commonly separate speech
enhancement, including localization, beamforming, and postfiltering, from acoustic …
enhancement, including localization, beamforming, and postfiltering, from acoustic …