Inclusive Deaf Education Enabled by Artificial Intelligence: The Path to a Solution
Deaf learners in the Global South struggle to access equitable education; in particular, there
are few instances where they can be facilitated in inclusive classrooms. The challenges …
A multi-speaker multi-lingual voice cloning system based on VITS2 for LIMMITS 2024 Challenge
This paper presents the development of a speech synthesis system for the LIMMITS'24
Challenge, focusing primarily on Track 2. The objective of the challenge is to establish a …
Generalized Fake Audio Detection via Deep Stable Learning
Although current fake audio detection approaches have achieved remarkable success on
specific datasets, they often fail when evaluated with datasets from different distributions …
Genuine-Focused Learning using Mask AutoEncoder for Generalized Fake Audio Detection
The generalization of Fake Audio Detection (FAD) is critical due to the emergence of new
spoofing techniques. Traditional FAD methods often focus solely on distinguishing between …
PPPR: Portable Plug-in Prompt Refiner for Text to Audio Generation
Text-to-Audio (TTA) aims to generate audio that corresponds to the given text description,
playing a crucial role in media production. The text descriptions in TTA datasets lack rich …
A Noval Feature via Color Quantisation for Fake Audio Detection
In the field of deepfake detection, previous studies focus on using reconstruction or mask
and prediction methods to train pre-trained models, which are then transferred to fake audio …
Dynamic Soft Windowing and Language Dependent Style Token for Code-Switching End-to-End Speech Synthesis
Most current end-to-end speech synthesis systems assume the input text is in a single
language. However, code-switching in speech occurs frequently in routine life, in which …
Bi-level style and prosody decoupling modeling for personalized end-to-end speech synthesis
End-to-end framework can generate high-quality and high-similarity speech in the
personalized speech synthesis task. However, the generalization of out-of-domain texts is …
Culturally Aware Intelligent Learning Environments for Resource-Poor Countries
This paper presents current work being done on the development of a speech and language
technology (SLT) based intelligent tutoring system for coaching young learners from a …
Expressive text-to-speech using style representations
LH Ueda - repositorio.unicamp.br
Artificial speech has been present in our lives in several aspects. From automatic messages
when you didn't answer your phone years ago to movies that portray robots capable of …