Deepfakes generation and detection: State-of-the-art, open challenges, countermeasures, and way forward

M Masood, M Nawaz, KM Malik, A Javed, A Irtaza… - Applied …, 2023 - Springer
Easy access to audio-visual content on social media, combined with the availability of
modern tools such as Tensorflow or Keras, and open-source trained models, along with …

Audio deepfake detection: A survey

J Yi, C Wang, J Tao, X Zhang, CY Zhang… - arxiv preprint arxiv …, 2023 - arxiv.org
Audio deepfake detection is an emerging active topic. A growing number of literatures have
aimed to study deepfake detection algorithms and achieved effective performance, the …

A survey on neural speech synthesis

X Tan, T Qin, F Soong, TY Liu - arxiv preprint arxiv:2106.15561, 2021 - arxiv.org
Text to speech (TTS), or speech synthesis, which aims to synthesize intelligible and natural
speech given text, is a hot research topic in speech, language, and machine learning …

Virtual-reality interpromotion technology for metaverse: A survey

D Wu, Z Yang, P Zhang, R Wang… - IEEE Internet of Things …, 2023 - ieeexplore.ieee.org
The metaverse aims to build an immersive virtual reality world to support the daily life, work,
and recreation of people. In this survey, the status quo of the metaverse is investigated, and …

Add 2023: the second audio deepfake detection challenge

J Yi, J Tao, R Fu, X Yan, C Wang, T Wang… - arxiv preprint arxiv …, 2023 - arxiv.org
Audio deepfake detection is an emerging topic in the artificial intelligence community. The
second Audio Deepfake Detection Challenge (ADD 2023) aims to spur researchers around …

Emotional voice conversion: Theory, databases and esd

K Zhou, B Sisman, R Liu, H Li - Speech Communication, 2022 - Elsevier
In this paper, we first provide a review of the state-of-the-art emotional voice conversion
research, and the existing emotional speech databases. We then motivate the development …

Seen and unseen emotional style transfer for voice conversion with a new emotional speech dataset

K Zhou, B Sisman, R Liu, H Li - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
Emotional voice conversion aims to transform emotional prosody in speech while preserving
the linguistic content and speaker identity. Prior studies show that it is possible to …

Voice conversion challenge 2020: Intra-lingual semi-parallel and cross-lingual voice conversion

Y Zhao, WC Huang, X Tian, J Yamagishi… - arxiv preprint arxiv …, 2020 - arxiv.org
The voice conversion challenge is a bi-annual scientific event held to compare and
understand different voice conversion (VC) systems built on a common dataset. In 2020, we …

Conventional and contemporary approaches used in text to speech synthesis: A review

N Kaur, P Singh - Artificial Intelligence Review, 2023 - Springer
Nowadays speech synthesis or text to speech (TTS), an ability of system to produce human
like natural sounding voice from the written text, is gaining popularity in the field of speech …

An overview of affective speech synthesis and conversion in the deep learning era

A Triantafyllopoulos, BW Schuller… - Proceedings of the …, 2023 - ieeexplore.ieee.org
Speech is the fundamental mode of human communication, and its synthesis has long been
a core priority in human–computer interaction research. In recent years, machines have …