- Academic Search

Foundation models for music: A survey

Y Ma, A Øland, A Ragni, BMS Del Sette, C Saitis… - arXiv preprint arXiv …, 2024 - arxiv.org

In recent years, foundation models (FMs) such as large language models (LLMs) and latent
diffusion models (LDMs) have profoundly impacted diverse sectors, including music. This …

Cited by 12, 4 versions

M2UGen: Multi-modal Music Understanding and Generation with the Power of Large Language Models

S Liu, AS Hussain, C Sun, Y Shan - arXiv preprint arXiv:2311.11255, 2023 - arxiv.org

The current landscape of research leveraging large language models (LLMs) is
experiencing a surge. Many works harness the powerful reasoning capabilities of these …

Cited by 34, 2 versions

Zero-shot unsupervised and text-based audio editing using DDPM inversion

H Manor, T Michaeli - arXiv preprint arXiv:2402.10009, 2024 - arxiv.org

Editing signals using large pre-trained models, in a zero-shot manner, has recently seen
rapid advancements in the image domain. However, this wave has yet to reach the audio …

Cited by 18, 6 versions

MusicMagus: Zero-shot text-to-music editing via diffusion models

Y Zhang, Y Ikemiya, G Xia, N Murata… - arXiv preprint arXiv …, 2024 - arxiv.org

Recent advances in text-to-music generation models have opened new avenues in musical
creativity. However, music generation usually involves iterative refinements, and how to edit …

Cited by 19, 6 versions

Loop Copilot: Conducting AI ensembles for music generation and iterative editing

Y Zhang, A Maezawa, G Xia, K Yamamoto… - arXiv preprint arXiv …, 2023 - arxiv.org

Creating music is iterative, requiring varied methods at each stage. However, existing AI
music systems fall short in orchestrating multiple subsystems for diverse needs. To address …

Cited by 17, 3 versions

InstructSpeech: Following speech editing instructions via large language models

R Huang, R Hu, Y Wang, Z Wang, X Cheng… - … on Machine Learning, 2024 - openreview.net

Instruction-guided speech editing aims to follow the user's natural language instruction to
manipulate the semantic and acoustic attributes of a speech. In this work, we construct triplet …

Cited by 2, 5 versions

COCOLA: Coherence-oriented contrastive learning of musical audio representations

R Ciranni, G Mariani, M Mancusi, E Postolache… - arXiv preprint arXiv …, 2024 - arxiv.org

We present COCOLA (Coherence-Oriented Contrastive Learning for Audio), a contrastive
learning method for musical audio representations that captures the harmonic and rhythmic …

Cited by 5, 2 versions

Generalized multi-source inference for text conditioned music diffusion models

E Postolache, G Mariani, L Cosmo… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org

Multi-Source Diffusion Models (MSDM) allow for compositional musical generation tasks:
generating a set of coherent sources, creating accompaniments, and performing source …

Cited by 9, 7 versions

Instruction-guided editing controls for images and multimedia: A survey in LLM era

TT Nguyen, Z Ren, T Pham, PL Nguyen, H Yin… - arXiv preprint arXiv …, 2024 - arxiv.org

The rapid advancement of large language models (LLMs) and multimodal learning has
transformed digital content creation and manipulation. Traditional visual editing tools require …

Cited by 2, 2 versions

ST-ITO: Controlling audio effects for style transfer with inference-time optimization

CJ Steinmetz, S Singh, M Comunità, I Ibnyahya… - arXiv preprint arXiv …, 2024 - arxiv.org

Audio production style transfer is the task of processing an input to impart stylistic elements
from a reference recording. Existing approaches often train a neural network to estimate …

Cited by 4, 6 versions

InstructME: An instruction guided music edit and remix framework with latent diffusion models
