A comprehensive survey on pretrained foundation models: A history from BERT to ChatGPT
Abstract Pretrained Foundation Models (PFMs) are regarded as the foundation for various
downstream tasks across different data modalities. A PFM (e.g., BERT, ChatGPT, GPT-4) is …
A survey on deep learning for symbolic music generation: Representations, algorithms, evaluations, and challenges
S Ji, X Yang, J Luo - ACM Computing Surveys, 2023 - dl.acm.org
Significant progress has been made in symbolic music generation with the help of deep
learning techniques. However, the tasks covered by symbolic music generation have not …
Museformer: Transformer with fine- and coarse-grained attention for music generation
Symbolic music generation aims to generate music scores automatically. A recent trend is to
use Transformer or its variants in music generation, which is, however, suboptimal, because …
Multimodal pretraining, adaptation, and generation for recommendation: A survey
Personalized recommendation serves as a ubiquitous channel for users to discover
information tailored to their interests. However, traditional recommendation models primarily …
Sparks of large audio models: A survey and outlook
This survey paper provides a comprehensive overview of the recent advancements and
challenges in applying large language models to the field of audio signal processing. Audio …
MidiBERT-piano: large-scale pre-training for symbolic music understanding
This paper presents an attempt to employ the mask language modeling approach of BERT
to pre-train a 12-layer Transformer model over 4,166 pieces of polyphonic piano MIDI files …
MuseCoco: Generating symbolic music from text
Generating music from text descriptions is a user-friendly mode since the text is a relatively
easy interface for user engagement. While some approaches utilize texts to control music …
Foundation models for music: A survey
In recent years, foundation models (FMs) such as large language models (LLMs) and latent
diffusion models (LDMs) have profoundly impacted diverse sectors, including music. This …
Natural language processing methods for symbolic music generation and information retrieval: A survey
Music is frequently associated with the notion of language as both domains share several
similarities, including the ability for their content to be represented as sequences of symbols …
Multitrack music transformer
Existing approaches for generating multitrack music with transformer models have been
limited in terms of the number of instruments, the length of the music segments and slow …