- Academic Search

N Gengembre, O Le Blouch… - Proc. Interspeech …, 2024 - anr-eva.gitlabpages.inria.fr

Modern voice conversion and anonymization architectures generally share a design
preserving source linguistic content and expressivity while modifying speaker timbre …

Cited by: 2

Text-to-Speech With Lip Synchronization Based on Speech-Assisted Text-to-Video Alignment and Masked Unit Prediction

Y Ahn, J Chae, JW Shin - IEEE Signal Processing Letters, 2025 - ieeexplore.ieee.org

Text-to-speech (TTS) with lip synchronization (TTSLS) is the task of generating a speech
signal synchronized with the lip movements in a video given the text transcription and the …

ASRRL-TTS: Agile Speaker Representation Reinforcement Learning for Text-to-Speech Speaker Adaptation

R Fu, X Qi, Z Wen, J Tao, T Wang, C Qiang… - arXiv preprint arXiv …, 2024 - arxiv.org

Speaker adaptation, which involves cloning voices from unseen speakers in the Text-to-
Speech task, has garnered significant interest due to its numerous applications in multi …

Latent Filling: Latent Space Data Augmentation for Zero-Shot Speech Synthesis

JS Bae, JY Lee, JH Lee, S Mun, T Kang… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org

Previous works in zero-shot text-to-speech (ZS-TTS) have attempted to enhance its systems
by enlarging the training data through crowd-sourcing or augmenting existing speech data …

Cited by: 3

Synthesis and Restoration of Traditional Ethnic Musical Instrument Timbres Based on Time-Frequency Analysis.

M Chen, Y Xiang, C Xiong - Traitement du Signal, 2024 - search.ebscohost.com

With the advent of the digital age, the preservation and restoration of the timbres of
traditional ethnic musical instruments have emerged as significant areas of study in …

Hierarchical timbre-cadence speaker encoder for zero-shot speech synthesis

Disentangling prosody and timbre embeddings via voice conversion
