A comprehensive survey of AI-generated content (AIGC): A history of generative AI from GAN to ChatGPT
Recently, ChatGPT, along with DALL-E-2 and Codex, has been gaining significant attention
from society. As a result, many individuals have become interested in related resources and …
A survey of AI-generated content (AIGC)
Recently, Artificial Intelligence Generated Content (AIGC) has gained significant attention
from society, especially with the rise of Generative AI (GAI) techniques such as ChatGPT …
MuLan: A joint embedding of music audio and natural language
Music tagging and content-based retrieval systems have traditionally been constructed
using pre-defined ontologies covering a rigid set of music attributes or text queries. This …
MARBLE: Music audio representation benchmark for universal evaluation
In the era of extensive intersection between art and Artificial Intelligence (AI), such as image
generation and fiction co-creation, AI for music remains relatively nascent, particularly in …
The Song Describer dataset: a corpus of audio captions for music-and-language evaluation
We introduce the Song Describer dataset (SDD), a new crowdsourced corpus of high-quality
audio-caption pairs, designed for the evaluation of music-and-language models. The …
Contrastive audio-language learning for music
As one of the most intuitive interfaces known to humans, natural language has the potential
to mediate many tasks that involve human-computer interaction, especially in application …
Supervised and unsupervised learning of audio representations for music understanding
In this work, we provide a broad comparative analysis of strategies for pre-training audio
understanding models for several tasks in the music domain, including labelling of genre …
Foundation models for music: A survey
In recent years, foundation models (FMs) such as large language models (LLMs) and latent
diffusion models (LDMs) have profoundly impacted diverse sectors, including music. This …
Multi-source diffusion models for simultaneous music generation and separation
In this work, we define a diffusion-based generative model capable of both music synthesis
and source separation by learning the score of the joint probability density of sources …
Toward universal text-to-music retrieval
This paper introduces effective design choices for text-to-music retrieval systems. An ideal
text-based retrieval system would support various input queries such as pre-defined tags …