Deepfakes generation and detection: State-of-the-art, open challenges, countermeasures, and way forward
Easy access to audio-visual content on social media, combined with the availability of
modern tools such as Tensorflow or Keras, and open-source trained models, along with …
An overview of voice conversion and its challenges: From statistical modeling to deep learning
Speaker identity is one of the important characteristics of human speech. In voice
conversion, we change the speaker identity from one to another, while keeping the linguistic …
Seen and unseen emotional style transfer for voice conversion with a new emotional speech dataset
Emotional voice conversion aims to transform emotional prosody in speech while preserving
the linguistic content and speaker identity. Prior studies show that it is possible to …
StarGAN-VC: Non-parallel many-to-many voice conversion using star generative adversarial networks
This paper proposes a method that allows non-parallel many-to-many voice conversion (VC)
by using a variant of a generative adversarial network (GAN) called StarGAN. Our method …
Biomimetic and flexible piezoelectric mobile acoustic sensors with multiresonant ultrathin structures for machine learning biometrics
Flexible resonant acoustic sensors have attracted substantial attention as an essential
component for intuitive human-machine interaction (HMI) in the future voice user interface …
Emotional voice conversion: Theory, databases and ESD
In this paper, we first provide a review of the state-of-the-art emotional voice conversion
research, and the existing emotional speech databases. We then motivate the development …
CycleGAN-VC2: Improved CycleGAN-based non-parallel voice conversion
Non-parallel voice conversion (VC) is a technique for learning the mapping from source to
target speech without relying on parallel data. This is an important task, but it has been …
CycleGAN-VC: Non-parallel voice conversion using cycle-consistent adversarial networks
We propose a non-parallel voice-conversion (VC) method that can learn a mapping from
source to target speech without relying on parallel data. The proposed method is particularly …
The Voice Conversion Challenge 2018: Promoting development of parallel and nonparallel methods
We present the Voice Conversion Challenge 2018, designed as a follow-up to the 2016
edition with the aim of providing a common framework for evaluating and comparing …
VQMIVC: Vector quantization and mutual information-based unsupervised speech representation disentanglement for one-shot voice conversion
One-shot voice conversion (VC), which performs conversion across arbitrary speakers with
only a single target-speaker utterance for reference, can be effectively achieved by speech …