A review on generative adversarial networks: Algorithms, theory, and applications
Generative adversarial networks (GANs) have recently become a hot research topic;
however, they have been studied since 2014, and a large number of algorithms have been …
An overview of voice conversion and its challenges: From statistical modeling to deep learning
Speaker identity is one of the important characteristics of human speech. In voice
conversion, we change the speaker identity from one to another, while keeping the linguistic …
AutoVC: Zero-shot voice style transfer with only autoencoder loss
Despite the progress in voice conversion, many-to-many voice conversion trained on non-
parallel data, as well as zero-shot voice conversion, remains under-explored. Deep style …
StarGAN-VC: Non-parallel many-to-many voice conversion using star generative adversarial networks
This paper proposes a method that allows non-parallel many-to-many voice conversion (VC)
by using a variant of a generative adversarial network (GAN) called StarGAN. Our method …
Emotional voice conversion: Theory, databases and ESD
In this paper, we first provide a review of the state-of-the-art emotional voice conversion
research, and the existing emotional speech databases. We then motivate the development …
CycleGAN-VC2: Improved CycleGAN-based non-parallel voice conversion
Non-parallel voice conversion (VC) is a technique for learning the mapping from source to
target speech without relying on parallel data. This is an important task, but it has been …
CycleGAN-VC: Non-parallel voice conversion using cycle-consistent adversarial networks
We propose a non-parallel voice-conversion (VC) method that can learn a mapping from
source to target speech without relying on parallel data. The proposed method is particularly …
VQMIVC: Vector quantization and mutual information-based unsupervised speech representation disentanglement for one-shot voice conversion
One-shot voice conversion (VC), which performs conversion across arbitrary speakers with
only a single target-speaker utterance for reference, can be effectively achieved by speech …
One-shot voice conversion by separating speaker and content representations with instance normalization
Recently, voice conversion (VC) without parallel data has been successfully adapted to multi-
target scenario in which a single model is trained to convert the input voice to many different …
Unsupervised speech decomposition via triple information bottleneck
Speech information can be roughly decomposed into four components: language content,
timbre, pitch, and rhythm. Obtaining disentangled representations of these components is …