Google Scholar

An overview of voice conversion systems

SH Mohammadi, A Kain - Speech Communication, 2017 - Elsevier

Voice transformation (VT) aims to change one or more aspects of a speech signal while
preserving linguistic information. A subset of VT, Voice conversion (VC) specifically aims to …

Cited by 352

[PDF] worldscientific.com

A review on human-computer interaction and intelligent robots

F Ren, Y Bao - International Journal of Information Technology & …, 2020 - World Scientific

In the field of artificial intelligence, human–computer interaction (HCI) technology and its
related intelligent robot technologies are essential and interesting contents of research …

Cited by 157

[PDF] sciencedirect.com

ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech

X Wang, J Yamagishi, M Todisco, H Delgado… - Computer Speech & …, 2020 - Elsevier

Automatic speaker verification (ASV) is one of the most natural and convenient means of
biometric person recognition. Unfortunately, just like all other biometric systems, ASV is …

Cited by 435

[PDF] arxiv.org

Tacotron: Towards end-to-end speech synthesis

Y Wang, RJ Skerry-Ryan, D Stanton, Y Wu… - arXiv preprint arXiv …, 2017 - arxiv.org

A text-to-speech synthesis system typically consists of multiple stages, such as a text
analysis frontend, an acoustic model and an audio synthesis module. Building these …

Cited by 2308

[PDF] academia.edu

[PDF] Wavenet: A generative model for raw audio

A Van Den Oord, S Dieleman, H Zen… - arXiv preprint arXiv …, 2016 - academia.edu

This paper introduces WaveNet, a deep neural network for generating raw audio waveforms.
The model is fully probabilistic and autoregressive, with the predictive distribution for each …

Cited by 6082

[PDF] arxiv.org

Wavenet: A generative model for raw audio

A Oord, S Dieleman, H Zen, K Simonyan… - arXiv preprint arXiv …, 2016 - arxiv.org

This paper introduces WaveNet, a deep neural network for generating raw audio waveforms.
The model is fully probabilistic and autoregressive, with the predictive distribution for each …

Cited by 1925

[PDF] arxiv.org

Deep voice 3: Scaling text-to-speech with convolutional sequence learning

W Ping, K Peng, A Gibiansky, SO Arik… - arXiv preprint arXiv …, 2017 - arxiv.org

We present Deep Voice 3, a fully-convolutional attention-based neural text-to-speech (TTS)
system. Deep Voice 3 matches state-of-the-art neural speech synthesis systems in …

Cited by 571

[PDF] jst.go.jp

WORLD: a vocoder-based high-quality speech synthesis system for real-time applications

M Morise, F Yokomori, K Ozawa - IEICE TRANSACTIONS on …, 2016 - search.ieice.org

A vocoder-based speech synthesis system, named WORLD, was developed in an effort to
improve the sound quality of real-time applications using speech. Speech analysis …

Cited by 1539

[PDF] abracadoudou.com

[PDF] Tacotron: A fully end-to-end text-to-speech synthesis model

Y Wang, RJ Skerry-Ryan… - arXiv preprint …, 2017 - bengio.abracadoudou.com

ABSTRACT A text-to-speech synthesis system typically consists of multiple stages, such as a
text analysis frontend, an acoustic model and an audio synthesis module. Building these …

Cited by 298

[PDF] audentia-gestion.fr

Unidirectional long short-term memory recurrent neural network with recurrent output layer for low-latency speech synthesis

H Zen, H Sak - … Conference on Acoustics, Speech and Signal …, 2015 - ieeexplore.ieee.org

Long short-term memory recurrent neural networks (LSTM-RNNs) have been applied to
various speech applications including acoustic modeling for statistical parametric speech …

Cited by 393
