Google Академія

S Latif, M Shoukat, F Shamshad, M Usama… - arxiv preprint arxiv …, 2023 - arxiv.org

This survey paper provides a comprehensive overview of the recent advancements and
challenges in applying large language models to the field of audio signal processing. Audio …

Зберегти Послатися Цитовано в 35 джерелах Пов’язані статті Кількість версій: 4 Показати у форматі HTML

[Free GPT-4]
[DeepSeek]

[PDF] neurips.cc

Simple and controllable music generation

J Copet, F Kreuk, I Gat, T Remez… - Advances in …, 2023 - proceedings.neurips.cc

We tackle the task of conditional music generation. We introduce MusicGen, a single
Language Model (LM) that operates over several streams of compressed discrete music …

Зберегти Послатися Цитовано в 474 джерелах Пов’язані статті Кількість версій: 8 Показати у форматі HTML

[Free GPT-4]
[DeepSeek]

[PDF] arxiv.org

Foundation models for music: A survey

Y Ma, A Øland, A Ragni, BMS Del Sette, C Saitis… - arxiv preprint arxiv …, 2024 - arxiv.org

In recent years, foundation models (FMs) such as large language models (LLMs) and latent
diffusion models (LDMs) have profoundly impacted diverse sectors, including music. This …

Зберегти Послатися Цитовано в 12 джерелах Пов’язані статті Кількість версій: 4 Показати у форматі HTML

[Free GPT-4]
[DeepSeek]

[PDF] neurips.cc

Textually pretrained speech language models

M Hassid, T Remez, TA Nguyen, I Gat… - Advances in …, 2023 - proceedings.neurips.cc

Speech language models (SpeechLMs) process and generate acoustic data only, without
textual supervision. In this work, we propose TWIST, a method for training SpeechLMs using …

Зберегти Послатися Цитовано в 61 джерелах Пов’язані статті Кількість версій: 5 Показати у форматі HTML

[Free GPT-4]
[DeepSeek]

[PDF] arxiv.org

Soundstorm: Efficient parallel audio generation

Z Borsos, M Sharifi, D Vincent, E Kharitonov… - arxiv preprint arxiv …, 2023 - arxiv.org

We present SoundStorm, a model for efficient, non-autoregressive audio generation.
SoundStorm receives as input the semantic tokens of AudioLM, and relies on bidirectional …

Зберегти Послатися Цитовано в 102 джерелах Пов’язані статті Кількість версій: 3 Показати у форматі HTML

[Free GPT-4]
[DeepSeek]

[PDF] ieee.org

Music controlnet: Multiple time-varying controls for music generation

SL Wu, C Donahue, S Watanabe… - IEEE/ACM Transactions …, 2024 - ieeexplore.ieee.org

Text-to-music generation models are now capable of generating high-quality music audio in
broad styles. However, text control is primarily suitable for the manipulation of global musical …

Зберегти Послатися Цитовано в 57 джерелах Пов’язані статті Кількість версій: 4

[Free GPT-4]
[DeepSeek]

[PDF] aaai.org

Red-Teaming for generative AI: Silver bullet or security theater?

M Feffer, A Sinha, WH Deng, ZC Lipton… - Proceedings of the AAAI …, 2024 - ojs.aaai.org

In response to rising concerns surrounding the safety, security, and trustworthiness of
Generative AI (GenAI) models, practitioners and regulators alike have pointed to AI red …

Зберегти Послатися Цитовано в 42 джерелах Пов’язані статті Кількість версій: 5 Показати у форматі HTML

[Free GPT-4]
[DeepSeek]

[PDF] arxiv.org

Chatmusician: Understanding and generating music intrinsically with llm

R Yuan, H Lin, Y Wang, Z Tian, S Wu, T Shen… - arxiv preprint arxiv …, 2024 - arxiv.org

While Large Language Models (LLMs) demonstrate impressive capabilities in text
generation, we find that their ability has yet to be generalized to music, humanity's creative …

Зберегти Послатися Цитовано в 36 джерелах Пов’язані статті Кількість версій: 6 Показати у форматі HTML

[Free GPT-4]
[DeepSeek]

[PDF] arxiv.org

Masked audio generation using a single non-autoregressive transformer

A Ziv, I Gat, GL Lan, T Remez, F Kreuk… - arxiv preprint arxiv …, 2024 - arxiv.org

We introduce MAGNeT, a masked generative sequence modeling method that operates
directly over several streams of audio tokens. Unlike prior work, MAGNeT is comprised of a …

Зберегти Послатися Цитовано в 35 джерелах Пов’язані статті Кількість версій: 4 Показати у форматі HTML

[Free GPT-4]
[DeepSeek]

[PDF] arxiv.org

Multi-source diffusion models for simultaneous music generation and separation

G Mariani, I Tallini, E Postolache, M Mancusi… - arxiv preprint arxiv …, 2023 - arxiv.org

In this work, we define a diffusion-based generative model capable of both music synthesis
and source separation by learning the score of the joint probability density of sources …

Зберегти Послатися Цитовано в 40 джерелах Пов’язані статті Кількість версій: 5 Показати у форматі HTML

Створити сповіщення

Послатися

Розширений пошук

Збережено в моїй бібліотеці

Singsong: Generating musical accompaniments from singing

Sparks of large audio models: A survey and outlook

Simple and controllable music generation

Foundation models for music: A survey

Textually pretrained speech language models

Soundstorm: Efficient parallel audio generation

Music controlnet: Multiple time-varying controls for music generation

Red-Teaming for generative AI: Silver bullet or security theater?

Chatmusician: Understanding and generating music intrinsically with llm

Masked audio generation using a single non-autoregressive transformer

Multi-source diffusion models for simultaneous music generation and separation