Foundation models for music: A survey
In recent years, foundation models (FMs) such as large language models (LLMs) and latent
diffusion models (LDMs) have profoundly impacted diverse sectors, including music. This …
Wave-u-net: A multi-scale neural network for end-to-end audio source separation
Models for audio source separation usually operate on the magnitude spectrum, which
ignores phase information and makes separation performance dependent on hyper …
Source separation in ecoacoustics: A roadmap towards versatile soundscape information retrieval
A comprehensive assessment of ecosystem dynamics requires the monitoring of biological,
physical and social changes. Changes that cannot be observed visually may be trackable …
Co-separating sounds of visual objects
Learning how objects sound from video is challenging, since they often heavily overlap in a
single audio channel. Current methods for visually-guided audio source separation sidestep …
Demucs: Deep extractor for music sources with extra unlabeled data remixed
We study the problem of source separation for music using deep learning with four known
sources: drums, bass, vocals and other accompaniments. State-of-the-art approaches …
A differentiable perceptual audio metric learned from just noticeable differences
Many audio processing tasks require perceptual assessment. The "gold standard" of
obtaining human judgments is time-consuming, expensive, and cannot be used as an …
Score-informed source separation of choral music
M Gover - 2020 - escholarship.mcgill.ca
Sound source separation consists of extracting one or more sources of significant
interest from a recording that contains multiple sound sources. These …
SoundBeam: Target sound extraction conditioned on sound-class labels and enrollment clues for increased performance and continuous learning
M Delcroix, JB Vázquez, T Ochiai… - … on Audio, Speech …, 2022 - ieeexplore.ieee.org
In many situations, we would like to hear desired sound events (SEs) while being able to
ignore interference. Target sound extraction (TSE) tackles this problem by estimating the …
Source separation with weakly labelled data: An approach to computational auditory scene analysis
Source separation is the task of separating an audio recording into individual sound
sources. Source separation is fundamental for computational auditory scene analysis …
Snore-GANs: Improving automatic snore sound classification with synthesized data
One of the frontier issues that severely hamper the development of automatic snore sound
classification (ASSC) relates to the lack of sufficient supervised training data. To cope …