Foundation models for music: A survey

Y Ma, A Øland, A Ragni, BMS Del Sette, C Saitis… - ar…

… Voice Generation

Z Ning, S Wang, Y Jiang, J Yao, L He, S Pan… - ar…

… a versatile deep neural network to model music audio is crucial in MIR. This task
is challenging due to the intricate spectral variations inherent in music signals, which convey …

Scoring Time Intervals using Non-Hierarchical Transformer For Automatic Piano Transcription

Y Yan, Z Duan - arXiv preprint arXiv:2404.09466, 2024 - arxiv.org
The neural semi-Markov Conditional Random Field (semi-CRF) framework has
demonstrated promise for event-based piano transcription. In this framework, all events …

Self-supervised music source separation using vector-quantized source category estimates

M Pasini, S Lattner, G Fazekas - arXiv preprint arXiv:2311.13058, 2023 - arxiv.org
Music source separation is focused on extracting distinct sonic elements from composite
tracks. Historically, many methods have been grounded in supervised learning …

Optimizing music source separation in complex audio environments through progressive self-knowledge distillation

CH Han, SH Lee - 2024 IEEE International Conference on …, 2024 - ieeexplore.ieee.org
This technical report presents our approach for The ICASSP 2024 SP Cadenza Grand
Challenge (CADICASSP24), focusing on effective source separation. In the scenario …