Google Scholar

Long-form music generation with latent diffusion

Z Evans, JD Parker, CJ Carr, Z Zukowski… - arXiv preprint arXiv …, 2024 - arxiv.org

Audio-based generative models for music have seen great strides recently, but so far have
not managed to produce full-length music tracks with coherent musical structure from text …

Cited by 38 · All 4 versions

[PDF] arxiv.org

WavChat: A survey of spoken dialogue models

S Ji, Y Chen, M Fang, J Zuo, J Lu, H Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

Recent advancements in spoken dialogue models, exemplified by systems like GPT-4o,
have captured significant attention in the speech domain. Compared to traditional three-tier …

Cited by 11 · All 2 versions

[PDF] arxiv.org

Enhancing zero-shot text-to-speech synthesis with human feedback

C Chen, Y Hu, W Wu, H Wang, ES Chng… - arXiv preprint arXiv …, 2024 - arxiv.org

In recent years, text-to-speech (TTS) technology has witnessed impressive advancements,
particularly with large-scale training datasets, showcasing human-level speech quality and …

Cited by 9 · All 3 versions

[PDF] arxiv.org

Emo-DPO: Controllable emotional speech synthesis through direct preference optimization

X Gao, C Zhang, Y Chen, H Zhang, NF Chen - arXiv preprint arXiv …, 2024 - arxiv.org

Current emotional text-to-speech (TTS) models predominantly conduct supervised training
to learn the conversion from text and desired emotion to its emotional speech, focusing on a …

Cited by 4 · All 2 versions

[PDF] mdpi.com

Crafting Creative Melodies: A User-Centric Approach for Symbolic Music Generation

S Dadman, BA Bremdal - Electronics, 2024 - mdpi.com

Composing coherent and structured music is one of the main challenges in symbolic music
generation. Our research aims to propose a user-centric framework design that promotes a …

Cited by 3 · All 7 versions

[PDF] arxiv.org

Seed-Music: A unified framework for high quality and controlled music generation

Y Bai, H Chen, J Chen, Z Chen, Y Deng… - arXiv preprint arXiv …, 2024 - arxiv.org

We introduce Seed-Music, a suite of music generation systems capable of producing high-quality
music with fine-grained style control. Our unified framework leverages both auto …

Cited by 2 · All 2 versions

[PDF] arxiv.org

MusicScore: A Dataset for Music Score Modeling and Generation

Y Lin, Z Dai, Q Kong - arXiv preprint arXiv:2406.11462, 2024 - arxiv.org

Music scores are written representations of music and contain rich information about musical
components. The visual information on music scores includes notes, rests, staff lines, clefs …

Cited by 2 · All 3 versions

[PDF] arxiv.org

Dynamic normativity: Necessary and sufficient conditions for value alignment

NK Corrêa - arXiv preprint arXiv:2406.11039, 2024 - arxiv.org

The critical inquiry pervading the realm of Philosophy, and perhaps extending its influence
across all Humanities disciplines, revolves around the intricacies of morality and normativity …

Cited by 2 · All 3 versions

Video Echoed in Harmony: Learning and Sampling Video-Integrated Chord Progression Sequences for Controllable Video Background Music Generation

X Tong, S Chen, P Yu, N Liu, H Qv, T Ma… - IEEE Transactions …, 2024 - ieeexplore.ieee.org

Automatically generating video background music mitigates the inefficiency and time-consuming
drawbacks of current manual video editing. Two key challenges hinder the …

Cited by 1

[PDF] arxiv.org

Bailing-TTS: Chinese Dialectal Speech Synthesis Towards Human-like Spontaneous Representation

X Di, Z Chen, Y Liang, J Zheng, Y Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

Large-scale text-to-speech (TTS) models have made significant progress recently. However,
they still fall short in the generation of Chinese dialectal speech. To address this, we propose …

Cited by 1 · All 2 versions

MusicRL: Aligning music generation to human preferences

Long-form music generation with latent diffusion
