- Academic Search

A Coy, PS Mohammed, P Skerrit - International Journal of Artificial …, 2024 - Springer

Deaf learners in the Global South struggle to access equitable education, in particular, there
are few instances where they can be facilitated in inclusive classrooms. The challenges …

Cited by: 5

A multi-speaker multi-lingual voice cloning system based on VITS2 for LIMMITS 2024 challenge

X Wang, Y Lu, X Qi, Z Wang, Y Xie, S Shi… - arXiv preprint arXiv …, 2024 - arxiv.org

This paper presents the development of a speech synthesis system for the LIMMITS'24
Challenge, focusing primarily on Track 2. The objective of the challenge is to establish a …

Cited by: 1

Generalized Fake Audio Detection via Deep Stable Learning

Z Wang, R Fu, Z Wen, Y Xie, Y Liu, X Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

Although current fake audio detection approaches have achieved remarkable success on
specific datasets, they often fail when evaluated with datasets from different distributions …

Cited by: 7

Genuine-Focused Learning using Mask AutoEncoder for Generalized Fake Audio Detection

X Wang, R Fu, Z Wen, Z Wang, Y Xie, Y Liu… - arXiv preprint arXiv …, 2024 - arxiv.org

The generalization of Fake Audio Detection (FAD) is critical due to the emergence of new
spoofing techniques. Traditional FAD methods often focus solely on distinguishing between …

Cited by: 6

PPPR: Portable Plug-in Prompt Refiner for Text to Audio Generation

S Shi, R Fu, Z Wen, J Tao, T Wang, C Qiang… - arXiv preprint arXiv …, 2024 - arxiv.org

Text-to-Audio (TTA) aims to generate audio that corresponds to the given text description,
playing a crucial role in media production. The text descriptions in TTA datasets lack rich …

A Noval Feature via Color Quantisation for Fake Audio Detection

Z Wang, X Wang, Y Xie, R Fu, Z Wen… - 2024 IEEE 14th …, 2024 - ieeexplore.ieee.org

In the field of deepfake detection, previous studies focus on using reconstruction or mask
and prediction methods to train pre-trained models, which are then transferred to fake audio …

Dynamic Soft Windowing and Language Dependent Style Token for Code-Switching End-to-End Speech Synthesis

R Fu, J Tao, Z Wen, J Yi, C Qiang, T Wang - INTERSPEECH, 2020 - isca-archive.org

Most of current end-to-end speech synthesis assumes the input text is in a single language
situation. However, codeswitching in speech occurs frequently in routine life, in which …

Cited by: 7

Bi-level style and prosody decoupling modeling for personalized end-to-end speech synthesis

R Fu, J Tao, Z Wen, J Yi, T Wang… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org

End-to-end framework can generate high-quality and high-similarity speech in the
personalized speech synthesis task. However, the generalization of out-of-domain texts is …

Cited by: 2

Culturally Aware Intelligent Learning Environments for Resource-Poor Countries

PS Mohammed, A Coy - International Conference on Human-Computer …, 2021 - Springer

This paper presents current work being done on the development of a speech and language
technology (SLT) based intelligent tutoring system for coaching young learners from a …

Cited by: 3

Expressive text-to-speech using style representations

LH Ueda - repositorio.unicamp.br

Artificial speech has been present in our lives in several aspects. From automatic messages
when you didn't answer your phone years ago to movies that portray robots capable of …

Focusing on attention: prosody transfer and adaptative optimization strategy for multi-speaker...

Inclusive Deaf Education Enabled by Artificial Intelligence: The Path to a Solution
