Foundation models for music: A survey

Y Ma, A Øland, A Ragni, BMS Del Sette, C Saitis… - arxiv preprint arxiv …, 2024 - arxiv.org
In recent years, foundation models (FMs) such as large language models (LLMs) and latent
diffusion models (LDMs) have profoundly impacted diverse sectors, including music. This …

MERT: Acoustic music understanding model with large-scale self-supervised training

Y Li, R Yuan, G Zhang, Y Ma, X Chen, H Yin… - arxiv preprint arxiv …, 2023 - arxiv.org
Self-supervised learning (SSL) has recently emerged as a promising paradigm for training
generalisable models on large-scale data in the fields of vision, text, and speech. Although …

MARBLE: Music audio representation benchmark for universal evaluation

R Yuan, Y Ma, Y Li, G Zhang, X Chen… - Advances in …, 2024 - proceedings.neurips.cc
In the era of extensive intersection between art and Artificial Intelligence (AI), such as image
generation and fiction co-creation, AI for music remains relatively nascent, particularly in …

MuPT: A generative symbolic music pretrained transformer

X Qu, Y Bai, Y Ma, Z Zhou, KM Lo, J Liu, R Yuan… - arxiv preprint arxiv …, 2024 - arxiv.org
In this paper, we explore the application of Large Language Models (LLMs) to the pre-
training of music. While the prevalent use of MIDI in music modeling is well-established, our …

MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training

Y Li, R Yuan, G Zhang, Y Ma, X Chen… - The Twelfth …, 2023 - openreview.net
Self-supervised learning (SSL) has recently emerged as a promising paradigm for training
generalisable models on large-scale data in the fields of vision, text, and speech. Although …

Learning music representations with wav2vec 2.0

A Ragano, E Benetos, A Hines - 2023 31st Irish Conference on …, 2023 - ieeexplore.ieee.org
Learning music representations that are general-purpose offers the flexibility to finetune
several downstream tasks using smaller datasets. The wav2vec 2.0 speech representation …

Audio Recognition of the Percussion Sounds Generated by a 3D Auto-Drum Machine System via Machine Learning

S Brezas, A Skoulakis, M Kaliakatsos-Papakostas… - Electronics, 2024 - mdpi.com
A novel 3D auto-drum machine system for the generation and recording of percussion
sounds is developed and presented. The capabilities of the machine, along with a …

Audio Contrastive based Fine-tuning

Y Wang, Q Liang, C Xiao, Y Li, NA Moubayed… - arxiv preprint arxiv …, 2023 - arxiv.org
Audio classification plays a crucial role in speech and sound processing tasks with a wide
range of applications. There still remains a challenge of striking the right balance between …

A Comparative Study of Pre-trained Audio and Speech Models for Heart Sound Detection

Y Duan, C Yang, Z Zhao, Y Jiang, Y Wang… - National Conference on …, 2023 - Springer
Cardiovascular disease screening is critically anchored in heart sound auscultation. As
deep learning methodologies advance, the impetus toward automating heart sound …

A Computational Approach to Analysis and Detection of Singing Techniques

Y Yamamoto - 2024 - tsukuba.repo.nii.ac.jp
Singing voice is one of the most essential elements of music. It provides impactful emotional
expressions through melody and lyrics and has the potential to move people's hearts and …