Foundation models for music: A survey
In recent years, foundation models (FMs) such as large language models (LLMs) and latent
diffusion models (LDMs) have profoundly impacted diverse sectors, including music. This …
Separate anything you describe
Language-queried audio source separation (LASS) is a new paradigm for computational
auditory scene analysis (CASA). LASS aims to separate a target sound from an audio …
Few-shot class-incremental audio classification via discriminative prototype learning
In real-world scenarios, new audio classes with insufficient samples usually emerge
continually, which motivates the study of few-shot class-incremental audio classification …
Source Separation of Piano Concertos with Test-Time Adaptation
Music source separation (MSS) aims at decomposing a music recording into its constituent
sources, such as a lead instrument and the accompaniment. Despite the difficulties in MSS …
A Stem-Agnostic Single-Decoder System for Music Source Separation Beyond Four Stems
Despite significant recent progress across multiple subtasks of audio source separation, few
music source separation systems support separation beyond the four-stem vocals, drums …
LC-Protonets: Multi-label Few-shot learning for world music audio tagging
We introduce Label-Combination Prototypical Networks (LC-Protonets) to address the
problem of multi-label few-shot classification, where a model must generalize to new classes …
Task-Aware Unified Source Separation
Several attempts have been made to handle multiple source separation tasks such as
speech enhancement, speech separation, sound event separation, music source separation …
Cross-Domain Contrastive Learning-Based Few-Shot Underwater Acoustic Target Recognition
Underwater Acoustic Target Recognition (UATR) plays a crucial role in underwater detection
devices. However, due to the difficulty and high cost of collecting data in the underwater …
Selective Annotation of Few Data for Beat Tracking of Latin American Music Using Rhythmic Features
Training state-of-the-art beat tracking models usually requires large amounts of annotated
data. It is widely known that data annotation is a time-consuming process and generally …
Separate This, and All of these Things Around It: Music Source Separation via Hyperellipsoidal Queries
Music source separation is an audio-to-audio retrieval task of extracting one or more
constituent components, or composites thereof, from a musical audio mixture. Each of these …