SpeechBrain: A general-purpose speech toolkit

M Ravanelli, T Parcollet, P Plantinga, A Rouhe… - arXiv preprint arXiv …, 2021 - arxiv.org
SpeechBrain is an open-source and all-in-one speech toolkit. It is designed to facilitate the
research and development of neural speech processing technologies by being simple …

Deep neural network techniques for monaural speech enhancement and separation: state of the art analysis

P Ochieng - Artificial Intelligence Review, 2023 - Springer
Deep neural network (DNN) techniques have become pervasive in domains such as
natural language processing and computer vision. They have achieved great success in …

TF-GridNet: Integrating full- and sub-band modeling for speech separation

ZQ Wang, S Cornell, S Choi, Y Lee… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org
We propose TF-GridNet for speech separation. The model is a novel deep neural network
(DNN) integrating full- and sub-band modeling in the time-frequency (TF) domain. It stacks …

LibriMix: An open-source dataset for generalizable speech separation

J Cosentino, M Pariente, S Cornell, A Deleforge… - arXiv preprint arXiv …, 2020 - arxiv.org
In recent years, wsj0-2mix has become the reference dataset for single-channel speech
separation. Most deep learning-based speech separation models today are benchmarked …

Wavesplit: End-to-end speech separation by speaker clustering

N Zeghidour, D Grangier - IEEE/ACM Transactions on Audio …, 2021 - ieeexplore.ieee.org
We introduce Wavesplit, an end-to-end source separation system. From a single mixture, the
model infers a representation for each source and then estimates each source signal given …

Voice separation with an unknown number of multiple speakers

E Nachmani, Y Adi, L Wolf - International Conference on …, 2020 - proceedings.mlr.press
We present a new method for separating a mixed audio sequence, in which multiple voices
speak simultaneously. The new method employs gated neural networks that are trained to …

Asteroid: the PyTorch-based audio source separation toolkit for researchers

M Pariente, S Cornell, J Cosentino… - arXiv preprint arXiv …, 2020 - arxiv.org
This paper describes Asteroid, the PyTorch-based audio source separation toolkit for
researchers. Inspired by the most successful neural source separation systems, it provides …

SpatialNet: Extensively learning spatial information for multichannel joint speech separation, denoising and dereverberation

C Quan, X Li - IEEE/ACM Transactions on Audio, Speech, and …, 2024 - ieeexplore.ieee.org
This work proposes a neural network to extensively exploit spatial information for
multichannel joint speech separation, denoising and dereverberation, named SpatialNet. In …

MossFormer: Pushing the performance limit of monaural speech separation using gated single-head transformer with convolution-augmented joint self-attentions

S Zhao, B Ma - … 2023-2023 IEEE International Conference on …, 2023 - ieeexplore.ieee.org
Transformer-based models have provided significant performance improvements in
monaural speech separation. However, there is still a performance gap compared to a …

Dual-signal transformation LSTM network for real-time noise suppression

NL Westhausen, BT Meyer - arXiv preprint arXiv:2005.07551, 2020 - arxiv.org
This paper introduces a dual-signal transformation LSTM network (DTLN) for real-time
speech enhancement as part of the Deep Noise Suppression Challenge (DNS-Challenge) …