Google Scholar

Self-supervised speech representation learning: A review

A Mohamed, H Lee, L Borgholt… - IEEE Journal of …, 2022 - ieeexplore.ieee.org

Although supervised deep learning has revolutionized speech and audio processing, it has
necessitated the building of specialist models for individual tasks and application scenarios …

Cited by 409, 10 versions

A comprehensive survey and analysis of generative models in machine learning

GM Harshvardhan, MK Gourisaria, M Pandey… - Computer Science …, 2020 - Elsevier

Generative models have been in existence for many decades. In the field of machine
learning, we come across many scenarios when directly learning a target is intractable …

Cited by 494, 2 versions

Qwen-Audio: Advancing universal audio understanding via unified large-scale audio-language models

Y Chu, J Xu, X Zhou, Q Yang, S Zhang, Z Yan… - ar…

… architectures suitable for modeling raw audio is a challenging problem due to
the high sampling rates of audio waveforms. Standard sequence modeling approaches like …

Cited by 224, 4 versions

Jukebox: A generative model for music

P Dhariwal, H Jun, C Payne, JW Kim… - arxiv preprint arxiv …, 2020 - assets.pubpub.org

We introduce Jukebox, a model that generates music with singing in the raw audio domain.
We tackle the long context of raw audio using a multiscale VQ-VAE to compress it to discrete …

Cited by 914, 7 versions

MelGAN: Generative adversarial networks for conditional waveform synthesis

K Kumar, R Kumar, T De Boissiere… - Advances in neural …, 2019 - proceedings.neurips.cc

Previous works (Donahue et al., 2018a; Engel et al., 2019a) have found that generating
coherent raw audio waveforms with GANs is challenging. In this paper, we show that it is …

Cited by 1184, 10 versions

MERT: Acoustic music understanding model with large-scale self-supervised training

Y Li, R Yuan, G Zhang, Y Ma, X Chen, H Yin… - arxiv preprint arxiv …, 2023 - arxiv.org

Self-supervised learning (SSL) has recently emerged as a promising paradigm for training
generalisable models on large-scale data in the fields of vision, text, and speech. Although …

Cited by 97, 7 versions

DDSP: Differentiable digital signal processing

J Engel, L Hantrakul, C Gu, A Roberts - arxiv preprint arxiv:2001.04643, 2020 - arxiv.org

Most generative models of audio directly generate samples in one of two domains: time or
frequency. While sufficient to express any signal, these representations are inefficient, as …

Cited by 527, 5 versions

Neural audio synthesis of musical notes with WaveNet autoencoders

Self-supervised speech representation learning: A review
