An overview of affective speech synthesis and conversion in the deep learning era
Speech is the fundamental mode of human communication, and its synthesis has long been
a core priority in human–computer interaction research. In recent years, machines have …
English emotional voice conversion using StarGAN model
The StarGANv2-VC model is a many-to-many non-parallel generative adversarial network
(GAN) voice conversion (VC) model that has proven effective in style conversion tasks. This …
E2E-S2S-VC: End-to-end sequence-to-sequence voice conversion
Takuma Okamoto, Tomoki Toda …
Durflex-evc: Duration-flexible emotional voice conversion with parallel generation
Emotional voice conversion involves modifying the pitch, spectral envelope, and other
acoustic characteristics of speech to match a desired emotional state while maintaining the …
Audio-based Kinship Verification Using Age Domain Conversion
Audio-based kinship verification (AKV) is important in many domains, such as home security
monitoring, forensic identification, and social network analysis. A key challenge in the task …
Towards realistic emotional voice conversion using controllable emotional intensity
Realistic emotional voice conversion (EVC) aims to enhance emotional diversity of
converted audios, making the synthesized voices more authentic and natural. To this end …
EmoSphere++: Emotion-Controllable Zero-Shot Text-to-Speech via Emotion-Adaptive Spherical Vector
Emotional text-to-speech (TTS) technology has achieved significant progress in recent
years; however, challenges remain owing to the inherent complexity of emotions and …
Scalability and diversity of StarGANv2-VC in Arabic emotional voice conversion: Overcoming data limitations and enhancing performance
AH Meftah, YA Alotaibi… - Journal of King Saud …, 2024 - researchgate.net
Emotional Voice Conversion (EVC) for under-resourced languages like Arabic
faces challenges due to limited emotional speech data. This study explored strategies to …
Mixed emotion modelling for emotional voice conversion
Emotional voice conversion (EVC) aims to convert the emotional state of an utterance from
one emotion to another while preserving the linguistic content and speaker identity. Current …
Scalability and diversity of StarGANv2-VC in Arabic emotional voice conversion: Arabic emotional voice conversion using StarGANv2-VC: Overcoming data …
Emotional Voice Conversion (EVC) for under-resourced languages like Arabic
faces challenges due to limited emotional speech data. This study explored strategies to …