Google Scholar

An overview of voice conversion and its challenges: From statistical modeling to deep learning

B Sisman, J Yamagishi, S King… - IEEE/ACM Transactions …, 2020 - ieeexplore.ieee.org

Speaker identity is one of the important characteristics of human speech. In voice
conversion, we change the speaker identity from one to another, while keeping the linguistic …

Cited by 421, 8 versions

[PDF] arxiv.org

A comprehensive survey on deep music generation: Multi-level representations, algorithms, evaluations, and future directions

S Ji, J Luo, X Yang - arXiv preprint arXiv:2011.06801, 2020 - arxiv.org

The utilization of deep learning techniques in generating various contents (such as image,
text, etc.) has become a trend. Especially music, the topic of this paper, has attracted …

Cited by 193, 3 versions

[PDF] arxiv.org

A survey on neural speech synthesis

X Tan, T Qin, F Soong, TY Liu - arXiv preprint arXiv:2106.15561, 2021 - arxiv.org

Text to speech (TTS), or speech synthesis, which aims to synthesize intelligible and natural
speech given text, is a hot research topic in speech, language, and machine learning …

Cited by 467, 2 versions

[PDF] arxiv.org

NaturalSpeech: End-to-end text-to-speech synthesis with human-level quality

X Tan, J Chen, H Liu, J Cong, C Zhang… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org

Text-to-speech (TTS) has made rapid progress in both academia and industry in recent
years. Some questions naturally arise that whether a TTS system can achieve human-level …

Cited by 223, 9 versions

[PDF] arxiv.org

SpeechT5: Unified-modal encoder-decoder pre-training for spoken language processing

J Ao, R Wang, L Zhou, C Wang, S Ren, Y Wu… - arXiv preprint arXiv …, 2021 - arxiv.org

Motivated by the success of T5 (Text-To-Text Transfer Transformer) in pre-trained natural
language processing models, we propose a unified-modal SpeechT5 framework that …

Cited by 249, 6 versions

[PDF] arxiv.org

A comparative study on Transformer vs RNN in speech applications

S Karita, N Chen, T Hayashi, T Hori… - 2019 IEEE automatic …, 2019 - ieeexplore.ieee.org

Sequence-to-sequence models have been widely used in end-to-end speech processing,
for example, automatic speech recognition (ASR), speech translation (ST), and text-to …

Cited by 897, 10 versions

[PDF] researchgate.net

Attention, please! A survey of neural attention models in deep learning

A de Santana Correia, EL Colombini - Artificial Intelligence Review, 2022 - Springer

In humans, Attention is a core property of all perceptual and cognitive operations. Given our
limited ability to process competing sources, attention mechanisms select, modulate, and …

Cited by 227, 8 versions

[PDF] arxiv.org

ESPnet-TTS: Unified, reproducible, and integratable open source end-to-end text-to-speech toolkit

T Hayashi, R Yamamoto, K Inoue… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org

This paper introduces a new end-to-end text-to-speech (E2E-TTS) toolkit named ESPnet-
TTS, which is an extension of the open-source speech processing toolkit ESPnet. The toolkit …

Cited by 246, 7 versions

[PDF] acm.org

MoGlow: Probabilistic and controllable motion synthesis using normalising flows

GE Henter, S Alexanderson, J Beskow - ACM Transactions on Graphics …, 2020 - dl.acm.org

Data-driven modelling and synthesis of motion is an active research area with applications
that include animation, games, and social robotics. This paper introduces a new class of …

Cited by 216, 7 versions

[PDF] arxiv.org

Towards automatic face-to-face translation

P KR, R Mukhopadhyay, J Philip, A Jha… - Proceedings of the 27th …, 2019 - dl.acm.org

In light of the recent breakthroughs in automatic machine translation systems, we propose a
novel approach that we term as "Face-to-Face Translation". As today's digital communication …

Cited by 209, 10 versions

Efficiently trainable text-to-speech system based on deep convolutional networks with guided...

An overview of voice conversion and its challenges: From statistical modeling to deep learning