A review of deep learning techniques for speech processing
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …
learning. The use of multiple processing layers has enabled the creation of models capable …
Attention is all you need in speech separation
Recurrent Neural Networks (RNNs) have long been the dominant architecture in sequence-
to-sequence learning. RNNs, however, are inherently sequential models that do not allow …
to-sequence learning. RNNs, however, are inherently sequential models that do not allow …
Dual-path transformer network: Direct context-aware modeling for end-to-end monaural speech separation
The dominant speech separation models are based on complex recurrent or convolution
neural network that model speech sequences indirectly conditioning on context, such as …
neural network that model speech sequences indirectly conditioning on context, such as …
Deep neural network techniques for monaural speech enhancement and separation: state of the art analysis
P Ochieng - Artificial Intelligence Review, 2023 - Springer
Deep neural networks (DNN) techniques have become pervasive in domains such as
natural language processing and computer vision. They have achieved great success in …
natural language processing and computer vision. They have achieved great success in …
Librimix: An open-source dataset for generalizable speech separation
In recent years, wsj0-2mix has become the reference dataset for single-channel speech
separation. Most deep learning-based speech separation models today are benchmarked …
separation. Most deep learning-based speech separation models today are benchmarked …
Unsupervised sound separation using mixture invariant training
In recent years, rapid progress has been made on the problem of single-channel sound
separation using supervised training of deep neural networks. In such supervised …
separation using supervised training of deep neural networks. In such supervised …
Voice separation with an unknown number of multiple speakers
We present a new method for separating a mixed audio sequence, in which multiple voices
speak simultaneously. The new method employs gated neural networks that are trained to …
speak simultaneously. The new method employs gated neural networks that are trained to …
TF-GridNet: Integrating full-and sub-band modeling for speech separation
We propose TF-GridNet for speech separation. The model is a novel deep neural network
(DNN) integrating full-and sub-band modeling in the time-frequency (TF) domain. It stacks …
(DNN) integrating full-and sub-band modeling in the time-frequency (TF) domain. It stacks …
Sudo rm-rf: Efficient networks for universal audio source separation
In this paper, we present an efficient neural network for end-to-end general purpose audio
source separation. Specifically, the backbone structure of this convolutional network is the …
source separation. Specifically, the backbone structure of this convolutional network is the …
Asteroid: the PyTorch-based audio source separation toolkit for researchers
This paper describes Asteroid, the PyTorch-based audio source separation toolkit for
researchers. Inspired by the most successful neural source separation systems, it provides …
researchers. Inspired by the most successful neural source separation systems, it provides …