DisMix: Disentangling Mixtures of Musical Instruments for Source-level Pitch and Timbre Manipulation

YJ Luo, KW Cheuk, W Choi, T Uesaka… - arXiv preprint arXiv …, 2024 - arxiv.org
Existing work on pitch and timbre disentanglement has been mostly focused on single-
instrument music audio, excluding the cases where multiple instruments are presented. To …

MR-MT3: Memory Retaining Multi-Track Music Transcription to Mitigate Instrument Leakage

HH Tan, KW Cheuk, T Cho, WH Liao… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper presents enhancements to the MT3 model, a state-of-the-art (SOTA) token-based
multi-instrument automatic music transcription (AMT) model. Despite SOTA performance …

Pitch-aware generative pretraining improves multi-pitch estimation with scarce data

M Pilataki, M Mauch, S Dixon - Proceedings of the 6th ACM International …, 2024 - dl.acm.org
We demonstrate that pretrained generative models can learn representations that are useful
for multi-pitch estimation. We explore representations extracted from DAC, a state-of-the-art …

ClaveNet: Generating Afro-Cuban Drum Patterns through Data Augmentation

D Flores García, H Flores García… - Proceedings of the 19th …, 2024 - dl.acm.org
We present ClaveNet: a generative MIDI model for Afro-Cuban percussion. We adapt the
Monotonic Groove Transformer (MGT)—originally trained on the Groove MIDI Dataset …

Disentangling Multi-instrument Music Audio for Source-level Pitch and Timbre Manipulation

YJ Luo, KW Cheuk, W Choi, WH Liao, K Toyama… - … NeurIPS 2024 Workshop … - openreview.net
Disentangling pitch and timbre from the audio of a musical instrument involves encoding
these two attributes as separate latent representations, allowing the synthesis of instrument …