M4singer: A multi-style, multi-singer and musical score provided mandarin singing corpus

L Zhang, R Li, S Wang, L Deng, J Liu… - Advances in …, 2022 - proceedings.neurips.cc
The lack of publicly available high-quality and accurately labeled datasets has long been a
major bottleneck for singing voice synthesis (SVS). To tackle this problem, we present …

Video2music: Suitable music generation from videos using an affective multimodal transformer model

J Kang, S Poria, D Herremans - Expert Systems with Applications, 2024 - Elsevier
Numerous studies in the field of music generation have demonstrated impressive
performance, yet virtually no models are able to directly generate music to match …

A comprehensive review on music transcription

B Bhattarai, J Lee - Applied Sciences, 2023 - mdpi.com
Music transcription is the process of transforming recorded sound of musical performances
into symbolic representations such as sheet music or MIDI files. Extensive research and …

Content-based controls for music large language modeling

L Lin, G Xia, J Jiang, Y Zhang - arXiv preprint arXiv:2310.17162, 2023 - arxiv.org
Recent years have witnessed a rapid growth of large-scale language models in the domain
of music audio. Such models enable end-to-end generation of higher-quality music, and …

Harmonizing minds and machines: survey on transformative power of machine learning in music

J Liang - Frontiers in Neurorobotics, 2023 - frontiersin.org
This survey explores the symbiotic relationship between Machine Learning (ML) and music,
focusing on the transformative role of Artificial Intelligence (AI) in the musical sphere …

High resolution guitar transcription via domain adaptation

X Riley, D Edwards, S Dixon - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org
Automatic music transcription (AMT) has achieved high accuracy for piano due to the
availability of large, high-quality datasets such as MAESTRO and MAPS, but comparable …

Training a singing transcription model using connectionist temporal classification loss and cross-entropy loss

JY Wang, JSR Jang - IEEE/ACM Transactions on Audio …, 2022 - ieeexplore.ieee.org
In this paper, we propose a method that uses a combination of the Connectionist Temporal
Classification (CTC) loss and the cross-entropy loss to train a note-level singing transcription …

Towards automatic transcription of polyphonic electric guitar music: A new dataset and a multi-loss transformer model

YH Chen, WY Hsiao, TK Hsieh… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
In this paper, we propose a new dataset named EGDB, that contains transcriptions of the
electric guitar performance of 240 tablatures rendered with different tones. Moreover, we …

A phoneme-informed neural network model for note-level singing transcription

S Yong, L Su, J Nam - ICASSP 2023-2023 IEEE International …, 2023 - ieeexplore.ieee.org
Note-level automatic music transcription is one of the most representative music information
retrieval (MIR) tasks and has been studied for various instruments to understand music …

Perceptual musical features for interpretable audio tagging

V Lyberatos, S Kantarelis, E Dervakos… - … on Acoustics, Speech …, 2024 - ieeexplore.ieee.org
In the age of music streaming platforms, the task of automatically tagging music audio has
garnered significant attention, driving researchers to devise methods aimed at enhancing …