Foundation models for music: A survey
In recent years, foundation models (FMs) such as large language models (LLMs) and latent
diffusion models (LDMs) have profoundly impacted diverse sectors, including music. This …
MERT: Acoustic music understanding model with large-scale self-supervised training
Self-supervised learning (SSL) has recently emerged as a promising paradigm for training
generalisable models on large-scale data in the fields of vision, text, and speech. Although …
MARBLE: Music audio representation benchmark for universal evaluation
In the era of extensive intersection between art and Artificial Intelligence (AI), such as image
generation and fiction co-creation, AI for music remains relatively nascent, particularly in …
MuPT: A generative symbolic music pretrained transformer
In this paper, we explore the application of Large Language Models (LLMs) to the
pre-training of music. While the prevalent use of MIDI in music modeling is well-established, our …
MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training
Self-supervised learning (SSL) has recently emerged as a promising paradigm for training
generalisable models on large-scale data in the fields of vision, text, and speech. Although …
Learning music representations with wav2vec 2.0
Learning music representations that are general-purpose offers the flexibility to finetune
several downstream tasks using smaller datasets. The wav2vec 2.0 speech representation …
Audio Recognition of the Percussion Sounds Generated by a 3D Auto-Drum Machine System via Machine Learning
A novel 3D auto-drum machine system for the generation and recording of percussion
sounds is developed and presented. The capabilities of the machine, along with a …
Audio Contrastive based Fine-tuning
Audio classification plays a crucial role in speech and sound processing tasks with a wide
range of applications. There still remains a challenge of striking the right balance between …
A Comparative Study of Pre-trained Audio and Speech Models for Heart Sound Detection
Cardiovascular disease screening is critically anchored in heart sound auscultation. As
deep learning methodologies advance, the impetus toward automating heart sound …
A Computational Approach to Analysis and Detection of Singing Techniques
Y Yamamoto - 2024 - tsukuba.repo.nii.ac.jp
Singing voice is one of the most essential elements of music. It provides impactful emotional
expressions through melody and lyrics and has the potential to move people's hearts and …