Inclusive Deaf Education Enabled by Artificial Intelligence: The Path to a Solution

A Coy, PS Mohammed, P Skerrit - International Journal of Artificial …, 2024 - Springer
Deaf learners in the Global South struggle to access equitable education, in particular, there
are few instances where they can be facilitated in inclusive classrooms. The challenges …

A multi-speaker multi-lingual voice cloning system based on VITS2 for LIMMITS 2024 challenge

X Wang, Y Lu, X Qi, Z Wang, Y Xie, S Shi… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper presents the development of a speech synthesis system for the LIMMITS'24
Challenge, focusing primarily on Track 2. The objective of the challenge is to establish a …

Generalized Fake Audio Detection via Deep Stable Learning

Z Wang, R Fu, Z Wen, Y Xie, Y Liu, X Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Although current fake audio detection approaches have achieved remarkable success on
specific datasets, they often fail when evaluated with datasets from different distributions …

Genuine-Focused Learning using Mask AutoEncoder for Generalized Fake Audio Detection

X Wang, R Fu, Z Wen, Z Wang, Y Xie, Y Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
The generalization of Fake Audio Detection (FAD) is critical due to the emergence of new
spoofing techniques. Traditional FAD methods often focus solely on distinguishing between …

PPPR: Portable Plug-in Prompt Refiner for Text to Audio Generation

S Shi, R Fu, Z Wen, J Tao, T Wang, C Qiang… - arXiv preprint arXiv …, 2024 - arxiv.org
Text-to-Audio (TTA) aims to generate audio that corresponds to the given text description,
playing a crucial role in media production. The text descriptions in TTA datasets lack rich …

A Noval Feature via Color Quantisation for Fake Audio Detection

Z Wang, X Wang, Y Xie, R Fu, Z Wen… - 2024 IEEE 14th …, 2024 - ieeexplore.ieee.org
In the field of deepfake detection, previous studies focus on using reconstruction or mask
and prediction methods to train pre-trained models, which are then transferred to fake audio …

Dynamic Soft Windowing and Language Dependent Style Token for Code-Switching End-to-End Speech Synthesis

R Fu, J Tao, Z Wen, J Yi, C Qiang, T Wang - INTERSPEECH, 2020 - isca-archive.org
Most of current end-to-end speech synthesis assumes the input text is in a single language
situation. However, codeswitching in speech occurs frequently in routine life, in which …

Bi-level style and prosody decoupling modeling for personalized end-to-end speech synthesis

R Fu, J Tao, Z Wen, J Yi, T Wang… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
End-to-end framework can generate high-quality and high-similarity speech in the
personalized speech synthesis task. However, the generalization of out-of-domain texts is …

Culturally Aware Intelligent Learning Environments for Resource-Poor Countries

PS Mohammed, A Coy - International Conference on Human-Computer …, 2021 - Springer
This paper presents current work being done on the development of a speech and language
technology (SLT) based intelligent tutoring system for coaching young learners from a …

Expressive text-to-speech using style representations

LH Ueda - repositorio.unicamp.br
Artificial speech has been present in our lives in several aspects. From automatic messages
when you didn't answer your phone years ago to movies that portray robots capable of …