- Academic Search

A survey on neural speech synthesis

X Tan, T Qin, F Soong, TY Liu - arXiv preprint arXiv:2106.15561, 2021 - arxiv.org

Text to speech (TTS), or speech synthesis, which aims to synthesize intelligible and natural
speech given text, is a hot research topic in speech, language, and machine learning …

Cited by 471 - All 2 versions

Nautilus: a versatile voice cloning system

HT Luong, J Yamagishi - IEEE/ACM Transactions on Audio …, 2020 - ieeexplore.ieee.org

We introduce a novel speech synthesis system, called NAUTILUS, that can generate speech
with a target voice either from a text input or a reference utterance of an arbitrary source …

Cited by 63 - All 5 versions

The sequence-to-sequence baseline for the voice conversion challenge 2020: Cascading ASR and TTS

WC Huang, T Hayashi, S Watanabe, T Toda - arXiv preprint arXiv …, 2020 - arxiv.org

This paper presents the sequence-to-sequence (seq2seq) baseline system for the voice
conversion challenge (VCC) 2020. We consider a naive approach for voice conversion (VC) …

Cited by 48 - All 8 versions

USAT: A Universal Speaker-Adaptive Text-to-Speech Approach

W Wang, Y Song, S Jha - IEEE/ACM Transactions on Audio …, 2024 - ieeexplore.ieee.org

Conventional text-to-speech (TTS) research has predominantly focused on enhancing the
quality of synthesized speech for speakers in the training dataset. The challenge of …

Cited by 11 - All 3 versions

QS-TTS: Towards Semi-Supervised Text-to-Speech Synthesis via Vector-Quantized Self-Supervised Speech Representation Learning

H Guo, F Xie, J Kang, Y Xiao, X Wu… - IEEE/ACM Transactions …, 2024 - ieeexplore.ieee.org

This paper proposes a novel semi-supervised TTS framework, QS-TTS, to improve TTS
quality with lower supervised data requirements via Vector-Quantized Self-Supervised …

Cited by 3 - All 2 versions

Cross-Lingual Speaker Adaptation Using Domain Adaptation and Speaker Consistency Loss for Text-To-Speech Synthesis

D Xin, Y Saito, S Takamichi, T Koriyama… - Interspeech, 2021 - isca-archive.org

We present a cross-lingual speaker adaptation method based on domain adaptation and a
speaker consistency loss for text-to-speech (TTS) synthesis. Existing monolingual speaker …

Cited by 15 - All 6 versions

Advancing Accessibility: Voice Cloning and Speech Synthesis for Individuals with Speech Disorders

LD Anand, DJ Reji - arXiv preprint arXiv:2401.11771, 2024 - arxiv.org

Neural Text-to-speech (TTS) synthesis is a powerful technology that can generate speech
using neural networks. One of the most remarkable features of TTS synthesis is its capability …

Cited by 6

On prosody modeling for ASR+TTS based voice conversion

WC Huang, T Hayashi, X Li… - 2021 IEEE Automatic …, 2021 - ieeexplore.ieee.org

In voice conversion (VC), an approach showing promising results in the latest voice
conversion challenge (VCC) 2020 is to first use an automatic speech recognition (ASR) …

Cited by 9 - All 6 versions

GC-TTS: Few-shot speaker adaptation with geometric constraints

JH Kim, SH Lee, JH Lee, HG Jung… - 2021 IEEE International …, 2021 - ieeexplore.ieee.org

Few-shot speaker adaptation is a specific Text-to-Speech (TTS) system that aims to
reproduce a novel speaker's voice with a few training data. While numerous attempts have …

Cited by 8 - All 5 versions

Speaker verification-derived loss and data augmentation for DNN-based multispeaker speech synthesis

B Lőrincz, A Stan, M Giurgiu - 2021 29th European Signal …, 2021 - ieeexplore.ieee.org

Building multispeaker neural network-based text-to-speech synthesis systems commonly
relies on the availability of large amounts of high quality recordings from each speaker and …

Cited by 6 - All 8 versions

Semi-supervised speaker adaptation for end-to-end speech synthesis with pretrained models

A survey on neural speech synthesis
