Emotional voice conversion: Theory, databases and ESD

K Zhou, B Sisman, R Liu, H Li - Speech Communication, 2022 - Elsevier
In this paper, we first provide a review of the state-of-the-art emotional voice conversion
research, and the existing emotional speech databases. We then motivate the development …

Video and audio deepfake datasets and open issues in deepfake technology: being ahead of the curve

Z Akhtar, TL Pendyala, VS Athmakuri - Forensic Sciences, 2024 - mdpi.com
The revolutionary breakthroughs in Machine Learning (ML) and Artificial Intelligence (AI) are
extensively being harnessed across a diverse range of domains, e.g., forensic science …

Speech synthesis with mixed emotions

K Zhou, B Sisman, R Rana… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Emotional speech synthesis aims to synthesize human voices with various emotional effects.
The current studies are mostly focused on imitating an averaged style belonging to a specific …

Emotion intensity and its control for emotional voice conversion

K Zhou, B Sisman, R Rana… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Emotional voice conversion (EVC) seeks to convert the emotional state of an utterance while
preserving the linguistic content and speaker identity. In EVC, emotions are usually treated …

Textless speech emotion conversion using discrete and decomposed representations

F Kreuk, A Polyak, J Copet, E Kharitonov… - arXiv preprint arXiv …, 2021 - arxiv.org
Speech emotion conversion is the task of modifying the perceived emotion of a speech
utterance while preserving the lexical content and speaker identity. In this study, we cast the …

An overview & analysis of sequence-to-sequence emotional voice conversion

Z Yang, X Jing, A Triantafyllopoulos, M Song… - arXiv preprint arXiv …, 2022 - arxiv.org
Emotional voice conversion (EVC) focuses on converting a speech utterance from a source
to a target emotion; it can thus be a key enabling technology for human-computer interaction …

Fusion of spectral and prosody modelling for multilingual speech emotion conversion

S Vekkot, D Gupta - Knowledge-Based Systems, 2022 - Elsevier
The paper proposes an integrated speech emotion conversion framework developed using
speaker-independent mixed-lingual training. The key contribution of the work is non-parallel …

PAVITS: Exploring prosody-aware VITS for end-to-end emotional voice conversion

T Qi, W Zheng, C Lu, Y Zong… - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org
In this paper, we propose Prosody-aware VITS (PAVITS) for emotional voice conversion
(EVC), aiming to achieve two major objectives of EVC: high content naturalness and high …

Nonparallel emotional voice conversion for unseen speaker-emotion pairs using dual domain adversarial network & virtual domain pairing

N Shah, M Singh, N Takahashi… - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org
Primary goal of an emotional voice conversion (EVC) system is to convert the emotion of a
given speech signal from one style to another style without modifying the linguistic content of …

Emotional dimension control in language model-based text-to-speech: Spanning a broad spectrum of human emotions

K Zhou, Y Zhang, S Zhao, H Wang, Z Pan, D Ng… - arXiv preprint arXiv …, 2024 - arxiv.org
Current emotional text-to-speech (TTS) systems face challenges in mimicking a broad
spectrum of human emotions due to the inherent complexity of emotions and limitations in …