Neural target speech extraction: An overview
K Zmolikova, M Delcroix, T Ochiai… - IEEE Signal …, 2023 - ieeexplore.ieee.org
Humans can listen to a target speaker even in challenging acoustic conditions that have
noise, reverberation, and interfering speakers. This phenomenon is known as the cocktail …
Wavesplit: End-to-end speech separation by speaker clustering
We introduce Wavesplit, an end-to-end source separation system. From a single mixture, the
model infers a representation for each source and then estimates each source signal given …
Weakly-supervised audio-visual segmentation
S Mo, B Raj - Advances in Neural Information Processing …, 2024 - proceedings.neurips.cc
Audio-visual segmentation is a challenging task that aims to predict pixel-level masks for
sound sources in a video. Previous work applied a comprehensive manually designed …
Improving universal sound separation using sound classification
Deep learning approaches have recently achieved impressive performance on both audio
source separation and sound classification. Most audio source separation approaches focus …
Meta-learning extractors for music source separation
We propose a hierarchical meta-learning-inspired model for music source separation (Meta-
TasNet) in which a generator model is used to predict the weights of individual extractor …
Move2hear: Active audio-visual source separation
We introduce the active audio-visual source separation problem, where an agent must move
intelligently in order to better isolate the sounds coming from an object of interest in its …
SoundBeam: Target sound extraction conditioned on sound-class labels and enrollment clues for increased performance and continuous learning
M Delcroix, JB Vázquez, T Ochiai… - … on Audio, Speech …, 2022 - ieeexplore.ieee.org
In many situations, we would like to hear desired sound events (SEs) while being able to
ignore interference. Target sound extraction (TSE) tackles this problem by estimating the …
A unified model for zero-shot music source separation, transcription and synthesis
We propose a unified model for three inter-related tasks: 1) to separate individual
sound sources from a mixed music audio, 2) to transcribe each sound source to MIDI …
Universal source separation with weakly labelled data
Universal source separation (USS) is a fundamental research task for computational
auditory scene analysis, which aims to separate mono recordings into individual source …
Heterogeneous target speech separation
We introduce a new paradigm for single-channel target source separation where the
sources of interest can be distinguished using non-mutually exclusive concepts (e.g., …