Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation

Y Luo, Z Chen, T Yoshioka - ICASSP 2020-2020 IEEE …, 2020 - ieeexplore.ieee.org
Recent studies in deep learning-based speech separation have proven the superiority of
time-domain approaches to conventional time-frequency-based methods. Unlike the time …

FSD50K: an open dataset of human-labeled sound events

E Fonseca, X Favory, J Pons, F Font… - IEEE/ACM Transactions …, 2021 - ieeexplore.ieee.org
Most existing datasets for sound event recognition (SER) are relatively small and/or domain-
specific, with the exception of AudioSet, based on over 2M tracks from YouTube videos and …

Deep neural network techniques for monaural speech enhancement and separation: state of the art analysis

P Ochieng - Artificial Intelligence Review, 2023 - Springer
Deep neural network (DNN) techniques have become pervasive in domains such as
natural language processing and computer vision. They have achieved great success in …

Two heads are better than one: A two-stage complex spectral mapping approach for monaural speech enhancement

A Li, W Liu, C Zheng, C Fan, X Li - IEEE/ACM Transactions on …, 2021 - ieeexplore.ieee.org
For challenging acoustic scenarios such as low signal-to-noise ratios, current speech
enhancement systems usually suffer from a performance bottleneck in extracting the target …

A comprehensive review of polyphonic sound event detection

TK Chan, CS Chin - IEEE Access, 2020 - ieeexplore.ieee.org
One of the most amazing functions of the human auditory system is the ability to detect all
kinds of sound events in the environment. With the technologies and hardware advances …

Unsupervised sound separation using mixture invariant training

S Wisdom, E Tzinis, H Erdogan… - Advances in neural …, 2020 - proceedings.neurips.cc
In recent years, rapid progress has been made on the problem of single-channel sound
separation using supervised training of deep neural networks. In such supervised …

Separate anything you describe

X Liu, Q Kong, Y Zhao, H Liu, Y Yuan… - … on Audio, Speech …, 2024 - ieeexplore.ieee.org
Language-queried audio source separation (LASS) is a new paradigm for computational
auditory scene analysis (CASA). LASS aims to separate a target sound from an audio …

Sudo rm -rf: Efficient networks for universal audio source separation

E Tzinis, Z Wang, P Smaragdis - 2020 IEEE 30th International …, 2020 - ieeexplore.ieee.org
In this paper, we present an efficient neural network for end-to-end general purpose audio
source separation. Specifically, the backbone structure of this convolutional network is the …

Asteroid: the PyTorch-based audio source separation toolkit for researchers

M Pariente, S Cornell, J Cosentino… - arXiv preprint arXiv …, 2020 - arxiv.org
This paper describes Asteroid, the PyTorch-based audio source separation toolkit for
researchers. Inspired by the most successful neural source separation systems, it provides …
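
As a rough illustration of the kind of workflow such a toolkit supports (not an example taken from the paper), the sketch below assumes Asteroid's ConvTasNet model class and a PyTorch mixture tensor of shape (batch, samples); the two-source configuration and the random one-second 8 kHz input are illustrative assumptions only.

    # Hedged sketch: instantiate an (untrained) ConvTasNet from the Asteroid
    # toolkit and run it on a dummy mixture. Shapes and the two-source setup
    # are assumptions for demonstration, not results from the paper.
    import torch
    from asteroid.models import ConvTasNet

    model = ConvTasNet(n_src=2)        # separation model for two sources
    mixture = torch.randn(1, 8000)     # (batch, samples): one second at 8 kHz
    with torch.no_grad():
        est_sources = model(mixture)   # (batch, n_src, samples)
    print(est_sources.shape)           # expected: torch.Size([1, 2, 8000])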

Dual-signal transformation LSTM network for real-time noise suppression

NL Westhausen, BT Meyer - arXiv preprint arXiv:2005.07551, 2020 - arxiv.org
This paper introduces a dual-signal transformation LSTM network (DTLN) for real-time
speech enhancement as part of the Deep Noise Suppression Challenge (DNS-Challenge) …