Automatic music transcription: An overview

E Benetos, S Dixon, Z Duan… - IEEE Signal Processing …, 2018 - ieeexplore.ieee.org
The capability of transcribing music audio into music notation is a fascinating example of
human intelligence. It involves perception (analyzing complex auditory scenes), cognition …

A tutorial on deep learning for music information retrieval

K Choi, G Fazekas, K Cho, M Sandler - arXiv preprint arXiv:1709.04396, 2017 - arxiv.org
Following their success in Computer Vision and other areas, deep learning techniques have
recently become widely adopted in Music Information Retrieval (MIR) research. However …

Onsets and frames: Dual-objective piano transcription

C Hawthorne, E Elsen, J Song, A Roberts… - arXiv preprint arXiv …, 2017 - arxiv.org
We advance the state of the art in polyphonic piano music transcription by using a deep
convolutional and recurrent neural network which is trained to jointly predict onsets and …

MT3: Multi-task multitrack music transcription

J Gardner, I Simon, E Manilow, C Hawthorne… - arXiv preprint arXiv …, 2021 - arxiv.org
Automatic Music Transcription (AMT), inferring musical notes from raw audio, is a
challenging task at the core of music understanding. Unlike Automatic Speech Recognition …

High-resolution piano transcription with pedals by regressing onset and offset times

Q Kong, B Li, X Song, Y Wan… - IEEE/ACM Transactions …, 2021 - ieeexplore.ieee.org
Automatic music transcription (AMT) is the task of transcribing audio recordings into
symbolic representations. Recently, neural network-based methods have been applied to …

Video2Music: Suitable music generation from videos using an affective multimodal transformer model

J Kang, S Poria, D Herremans - Expert Systems with Applications, 2024 - Elsevier
Numerous studies in the field of music generation have demonstrated impressive
performance, yet virtually no models are able to directly generate music to match …

Learning features of music from scratch

J Thickstun, Z Harchaoui, S Kakade - arXiv preprint arXiv:1611.09827, 2016 - arxiv.org
This paper introduces a new large-scale music dataset, MusicNet, to serve as a source of
supervision and evaluation of machine learning methods for music research. MusicNet …

HEAR: Holistic evaluation of audio representations

J Turian, J Shier, HR Khan, B Raj… - NeurIPS 2021 …, 2022 - proceedings.mlr.press
What audio embedding approach generalizes best to a wide range of downstream tasks
across a variety of everyday domains without fine-tuning? The aim of the HEAR benchmark …

nnAudio: An on-the-fly GPU audio to spectrogram conversion toolbox using 1D convolutional neural networks

KW Cheuk, H Anderson, K Agres, D Herremans - IEEE Access, 2020 - ieeexplore.ieee.org
In this paper, we present nnAudio, a new neural network-based audio processing framework
with graphics processing unit (GPU) support that leverages 1D convolutional neural …

ASAP: a dataset of aligned scores and performances for piano transcription

F Foscarin, A McLeod, P Rigaux… - Proceedings of the …, 2020 - infoscience.epfl.ch
In this paper we present Aligned Scores and Performances (ASAP): a new dataset of 222
digital musical scores aligned with 1068 performances (more than 92 hours) of Western …