A review of deep learning techniques for speech processing

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

Normalizing flows for probabilistic modeling and inference

G Papamakarios, E Nalisnick, DJ Rezende… - Journal of Machine …, 2021 - jmlr.org
Normalizing flows provide a general mechanism for defining expressive probability
distributions, only requiring the specification of a (usually simple) base distribution and a …
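To make the base-distribution-plus-transform recipe concrete, here is a minimal NumPy/SciPy sketch of a one-dimensional flow with a single affine bijection standing in for a learned transform; the constants and function names are illustrative, not from the paper.

```python
# Minimal sketch of a normalizing flow: a simple base density plus an
# invertible transform gives a new density whose samples and exact
# log-density are both cheap to compute. The affine map is a stand-in
# for a learned bijection.
import numpy as np
from scipy.stats import norm

# Base distribution: standard normal.
def base_log_prob(z):
    return norm.logpdf(z)

# Invertible transform x = f(z) = a * z + b and its inverse.
a, b = 2.0, -1.0
def f(z):
    return a * z + b
def f_inv(x):
    return (x - b) / a

# Change of variables: log p_X(x) = log p_Z(f^{-1}(x)) - log|a|
def flow_log_prob(x):
    return base_log_prob(f_inv(x)) - np.log(abs(a))

# Sampling: draw from the base and push through the transform.
z = np.random.randn(5)
x = f(z)
print(x, flow_log_prob(x))
```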

BigVGAN: A universal neural vocoder with large-scale training

S Lee, W Ping, B Ginsburg, B Catanzaro… - arxiv preprint arxiv …, 2022 - arxiv.org
Developing architectures suitable for modeling raw audio is a challenging problem due to
the high sampling rates of audio waveforms. Standard sequence modeling approaches like …
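Since BigVGAN is a GAN-based vocoder, a rough sketch of the adversarial setup may help: a generator upsamples mel-spectrogram frames into a raw waveform while a discriminator scores real versus generated audio. The toy PyTorch modules, layer sizes, and least-squares GAN loss below are assumptions for illustration, not the paper's architecture or objectives.

```python
# Toy GAN-vocoder training step: generator maps mel (B, 80, T) to a
# waveform (B, 1, T*256); discriminator scores waveform patches.
import torch
import torch.nn as nn

N_MELS, HOP = 80, 256                    # assumed feature size / upsampling factor

generator = nn.Sequential(               # 8 * 8 * 4 = 256x upsampling
    nn.ConvTranspose1d(N_MELS, 64, kernel_size=16, stride=8, padding=4),
    nn.LeakyReLU(0.1),
    nn.ConvTranspose1d(64, 32, kernel_size=16, stride=8, padding=4),
    nn.LeakyReLU(0.1),
    nn.ConvTranspose1d(32, 1, kernel_size=8, stride=4, padding=2),
    nn.Tanh(),
)
discriminator = nn.Sequential(           # waveform -> per-patch realness scores
    nn.Conv1d(1, 32, kernel_size=15, stride=4, padding=7), nn.LeakyReLU(0.1),
    nn.Conv1d(32, 64, kernel_size=15, stride=4, padding=7), nn.LeakyReLU(0.1),
    nn.Conv1d(64, 1, kernel_size=3, padding=1),
)
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
mse = nn.MSELoss()                       # least-squares GAN objective (assumed)

mel = torch.randn(4, N_MELS, 32)         # dummy conditioning features
real = torch.randn(4, 1, 32 * HOP)       # dummy ground-truth waveforms

# Discriminator step: push real toward 1, generated toward 0.
fake = generator(mel).detach()
d_real, d_fake = discriminator(real), discriminator(fake)
d_loss = mse(d_real, torch.ones_like(d_real)) + mse(d_fake, torch.zeros_like(d_fake))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator step: fool the discriminator (generated -> 1).
scores = discriminator(generator(mel))
g_loss = mse(scores, torch.ones_like(scores))
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```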

A survey on neural speech synthesis

X Tan, T Qin, F Soong, TY Liu - arxiv preprint arxiv:2106.15561, 2021 - arxiv.org
Text to speech (TTS), or speech synthesis, which aims to synthesize intelligible and natural
speech given text, is a hot research topic in speech, language, and machine learning …
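As background for the survey's scope, here is a stubbed-out sketch of the common two-stage pipeline it covers: an acoustic model predicts a mel-spectrogram from text, and a vocoder turns the mel-spectrogram into a waveform. All function names, shapes, and the random placeholders are assumptions.

```python
# Stubbed two-stage TTS pipeline: text -> phonemes -> mel-spectrogram -> waveform.
import numpy as np

def text_to_phonemes(text: str) -> list[str]:
    # Stand-in front end: real systems use a grapheme-to-phoneme converter.
    return list(text.lower().replace(" ", "|"))

def acoustic_model(phonemes: list[str], n_mels: int = 80) -> np.ndarray:
    # Stand-in acoustic model: here it "predicts" ~10 random mel frames per phoneme.
    return np.random.randn(n_mels, 10 * len(phonemes))

def vocoder(mel: np.ndarray, hop: int = 256) -> np.ndarray:
    # Stand-in vocoder: maps each mel frame to a hop-size chunk of (random) waveform.
    return np.random.randn(mel.shape[1] * hop)

mel = acoustic_model(text_to_phonemes("hello world"))
wav = vocoder(mel)                       # 1-D waveform at the target sample rate
print(mel.shape, wav.shape)
```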

DiffWave: A versatile diffusion model for audio synthesis

Z Kong, W Ping, J Huang, K Zhao… - arxiv preprint arxiv …, 2020 - arxiv.org
In this work, we propose DiffWave, a versatile diffusion probabilistic model for conditional
and unconditional waveform generation. The model is non-autoregressive, and converts the …
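A minimal sketch of the denoising-diffusion training step that models like DiffWave build on: corrupt a clean waveform at a random diffusion step, then train a network to predict the injected noise. The tiny convolutional "denoiser", the linear noise schedule, and the omission of the diffusion-step embedding are simplifications, not DiffWave's actual design.

```python
# One training step of a toy denoising diffusion model on waveforms.
import torch
import torch.nn as nn

T = 50                                          # number of diffusion steps
betas = torch.linspace(1e-4, 0.05, T)           # assumed noise schedule
alpha_bar = torch.cumprod(1.0 - betas, dim=0)   # cumulative product \bar{alpha}_t

denoiser = nn.Sequential(                       # (B, 1, L) -> (B, 1, L)
    nn.Conv1d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv1d(32, 1, kernel_size=3, padding=1),
)
opt = torch.optim.Adam(denoiser.parameters(), lr=1e-3)

x0 = torch.randn(8, 1, 1024)                    # dummy clean waveforms
t = torch.randint(0, T, (8,))                   # random diffusion step per example
noise = torch.randn_like(x0)

# Forward process: x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * noise
abar = alpha_bar[t].view(-1, 1, 1)
x_t = abar.sqrt() * x0 + (1 - abar).sqrt() * noise

# Objective: predict the injected noise (step embedding omitted for brevity).
loss = nn.functional.mse_loss(denoiser(x_t), noise)
opt.zero_grad(); loss.backward(); opt.step()
```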

WaveGrad: Estimating gradients for waveform generation

N Chen, Y Zhang, H Zen, RJ Weiss, M Norouzi… - arxiv preprint arxiv …, 2020 - arxiv.org
This paper introduces WaveGrad, a conditional model for waveform generation which
estimates gradients of the data density. The model is built on prior work on score matching …
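A toy illustration of why an estimate of the data-density gradient (the score) is enough to generate samples: Langevin-style updates repeatedly follow the score and add noise. Here the analytic score of a standard normal stands in for WaveGrad's learned, mel-conditioned estimate, and the step size is an arbitrary assumption.

```python
# Langevin dynamics driven by the score (gradient of the log-density).
import numpy as np

def score(x):
    return -x                               # d/dx log N(x; 0, 1)

rng = np.random.default_rng(0)
x = 5.0 * rng.standard_normal(10_000)       # start far from the target density

step = 0.1
for _ in range(200):                        # gradient step plus injected noise
    x = x + step * score(x) + np.sqrt(2 * step) * rng.standard_normal(x.shape)

print(x.mean(), x.std())                    # should approach 0 and 1
```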

FastSpeech 2: Fast and high-quality end-to-end text to speech

Y Ren, C Hu, X Tan, T Qin, S Zhao, Z Zhao… - arxiv preprint arxiv …, 2020 - arxiv.org
Non-autoregressive text to speech (TTS) models such as FastSpeech can synthesize
speech significantly faster than previous autoregressive models with comparable quality …
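Parallel generation in the FastSpeech family relies on a length regulator: each phoneme's hidden vector is repeated for its predicted number of mel frames, so the whole mel-spectrogram can be decoded in one pass instead of frame by frame. The sizes and the rounding rule in this PyTorch sketch are assumptions.

```python
# Length-regulator sketch: expand per-phoneme states by predicted durations.
import torch

hidden = torch.randn(6, 256)                # one 256-dim vector per phoneme
log_durations = torch.randn(6)              # duration-predictor output (log frames)
frames = torch.clamp(torch.round(torch.exp(log_durations)), min=1).long()

# Expand: phoneme i is repeated frames[i] times along the time axis.
expanded = torch.repeat_interleave(hidden, frames, dim=0)
print(frames.tolist(), expanded.shape)      # (sum(frames), 256) -> decoder input
```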

Glow-TTS: A generative flow for text-to-speech via monotonic alignment search

J Kim, S Kim, J Kong, S Yoon - Advances in Neural …, 2020 - proceedings.neurips.cc
Recently, text-to-speech (TTS) models such as FastSpeech and ParaNet have been
proposed to generate mel-spectrograms from text in parallel. Despite the advantage, the …
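The monotonic alignment search in the title is a dynamic program over a log-likelihood matrix between text tokens and mel frames: it finds the best alignment in which frames are assigned to tokens in order, with no token skipped. The NumPy sketch below captures that idea under assumptions about the interface and tie-breaking; it is not the paper's reference implementation.

```python
# Dynamic-programming sketch of monotonic alignment search.
import numpy as np

def monotonic_alignment_search(log_lik: np.ndarray) -> np.ndarray:
    """log_lik[i, j]: log-likelihood of mel frame j under text token i.
    Returns, for each frame j, the token index it is aligned to under the
    best monotonic, non-skipping alignment."""
    n_text, n_mel = log_lik.shape
    Q = np.full((n_text, n_mel), -np.inf)
    Q[0, 0] = log_lik[0, 0]
    for j in range(1, n_mel):
        for i in range(min(j + 1, n_text)):
            stay = Q[i, j - 1]                              # keep the same token
            move = Q[i - 1, j - 1] if i > 0 else -np.inf    # advance to the next token
            Q[i, j] = max(stay, move) + log_lik[i, j]
    # Backtrack from the bottom-right corner.
    align = np.zeros(n_mel, dtype=int)
    i = n_text - 1
    for j in range(n_mel - 1, -1, -1):
        align[j] = i
        if j > 0 and i > 0 and Q[i - 1, j - 1] >= Q[i, j - 1]:
            i -= 1
    return align

ll = np.random.randn(4, 12)                 # 4 tokens, 12 mel frames
print(monotonic_alignment_search(ll))       # e.g. [0 0 1 1 1 2 2 2 3 3 3 3]
```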

Normalizing flows: An introduction and review of current methods

I Kobyzev, SJD Prince… - IEEE transactions on …, 2020 - ieeexplore.ieee.org
Normalizing Flows are generative models which produce tractable distributions where both
sampling and density evaluation can be efficient and exact. The goal of this survey article is …
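The exactness claim rests on the change-of-variables formula; stated compactly for an invertible transform f with base density p_Z (notation assumed):

```latex
% Change of variables for an invertible f: z ~ p_Z, x = f(z).
% Density evaluation is exact whenever the Jacobian determinant is tractable,
% which flow architectures arrange by construction.
\[
  p_X(x) \;=\; p_Z\!\left(f^{-1}(x)\right)\,
  \left|\det \frac{\partial f^{-1}(x)}{\partial x}\right|,
  \qquad
  \log p_X(x) \;=\; \log p_Z\!\left(f^{-1}(x)\right)
  \;+\; \log\left|\det \frac{\partial f^{-1}(x)}{\partial x}\right|.
\]
```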