- Academic Search

Deep learning-based expressive speech synthesis: a systematic review of approaches, challenges, and resources

H Barakat, O Turk, C Demiroglu - EURASIP Journal on Audio, Speech, and …, 2024 - Springer

Speech synthesis has made significant strides thanks to the transition from machine learning
to deep learning models. Contemporary text-to-speech (TTS) models possess the capability …

Cited by 11. All 8 versions.

[PDF] arxiv.org

BASE TTS: Lessons from building a billion-parameter text-to-speech model on 100k hours of data

M Łajszczak, G Cámbara, Y Li, F Beyhan… - arXiv preprint arXiv …, 2024 - arxiv.org

We introduce a text-to-speech (TTS) model called BASE TTS, which stands for Big
Adaptive Streamable TTS with Emergent abilities …

Cited by 69. All 6 versions.

[PDF] acm.org

Not my voice! a taxonomy of ethical and safety harms of speech generators

W Hutiri, O Papakyriakopoulos, A Xiang - Proceedings of the 2024 ACM …, 2024 - dl.acm.org

The rapid and wide-scale adoption of AI to generate human speech poses a range of
significant ethical and safety risks to society that need to be addressed. For example, a …

Cited by 18. All 4 versions.

[PDF] neurips.cc

SLIM: Style-linguistics mismatch model for generalized audio deepfake detection

Y Zhu, S Koppisetti, T Tran… - Advances in Neural …, 2025 - proceedings.neurips.cc

Audio deepfake detection (ADD) is crucial to combat the misuse of speech synthesized by
generative AI models. Existing ADD models suffer from generalization issues to unseen …

Cited by 5. All 5 versions.

[PDF] science.org

Beyond Deep Learning: Charting the Next Frontiers of Affective Computing

A Triantafyllopoulos, L Christ, A Gebhard… - Intelligent …, 2024 - spj.science.org

Affective computing (AC), like most other areas of computational research, has benefited
tremendously from advances in deep learning (DL). These advances have opened up new …

Cited by 1. All 3 versions.

Improved dendritic learning: Activation function analysis

Y Wang, Y Yu, T Zhang, K Song, Y Wang, S Gao - Information Sciences, 2024 - Elsevier

This study conducted a thorough evaluation of an improved dendritic learning (DL)
framework, focusing specifically on its application in power load forecasting. The objective …

Cited by 4. All 3 versions.

[PDF] arxiv.org

Hierarchical emotion prediction and control in text-to-speech synthesis

S Inoue, K Zhou, S Wang, H Li - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org

It remains a challenge to effectively control the emotion rendering in text-to-speech (TTS)
synthesis. Prior studies have primarily focused on learning a global prosodic representation …

Cited by 7. All 3 versions.

MDRT: Multi-domain synthetic speech localization

AKS Yadav, K Bhagtani, S Baireddy… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org

With recent advancements in generating synthetic speech, tools to generate high-quality
synthetic speech impersonating any human speaker are easily available. Several incidents …

Cited by 5. All 2 versions.

[PDF] arxiv.org

Expressivity and speech synthesis

A Triantafyllopoulos, BW Schuller - arXiv preprint arXiv:2404.19363, 2024 - arxiv.org

Imbuing machines with the ability to talk has been a longtime pursuit of artificial intelligence
(AI) research. From the very beginning, the community has not only aimed to synthesise high …

Cited by 4. All 2 versions.

[PDF] arxiv.org

Emotional dimension control in language model-based text-to-speech: Spanning a broad spectrum of human emotions

K Zhou, Y Zhang, S Zhao, H Wang, Z Pan, D Ng… - arXiv preprint arXiv …, 2024 - arxiv.org

Current emotional text-to-speech (TTS) systems face challenges in mimicking a broad
spectrum of human emotions due to the inherent complexity of emotions and limitations in …

Cited by 3. All 2 versions.

An overview of affective speech synthesis and conversion in the deep learning era