An overview of voice conversion and its challenges: From statistical modeling to deep learning
Speaker identity is one of the important characteristics of human speech. In voice
conversion, we change the speaker identity from one to another, while keeping the linguistic …
Generative adversarial networks for speech processing: A review
Generative adversarial networks (GANs) have seen remarkable progress in recent years.
They are used as generative models for all kinds of data such as text, images, audio, music …
StarGAN-VC: Non-parallel many-to-many voice conversion using star generative adversarial networks
This paper proposes a method that allows non-parallel many-to-many voice conversion (VC)
by using a variant of a generative adversarial network (GAN) called StarGAN. Our method …
CycleGAN-VC2: Improved CycleGAN-based non-parallel voice conversion
Non-parallel voice conversion (VC) is a technique for learning the mapping from source to
target speech without relying on parallel data. This is an important task, but it has been …
CycleGAN-VC: Non-parallel voice conversion using cycle-consistent adversarial networks
We propose a non-parallel voice-conversion (VC) method that can learn a mapping from
source to target speech without relying on parallel data. The proposed method is particularly …
Parallel-data-free voice conversion using cycle-consistent adversarial networks
Time-frequency masking-based speech enhancement using generative adversarial network
The success of time-frequency (TF) mask-based approaches is dependent on the accuracy
of predicted mask given the noisy spectral features. The state-of-the-art methods in TF …
StarGAN-VC2: Rethinking conditional methods for StarGAN-based voice conversion
ACVAE-VC: Non-parallel voice conversion with auxiliary classifier variational autoencoder
This paper proposes a non-parallel voice conversion (VC) method using a variant of the
conditional variational autoencoder (VAE) called an auxiliary classifier VAE. The proposed …
Sequence-to-sequence acoustic modeling for voice conversion
In this paper, a neural network named sequence-to-sequence ConvErsion NeTwork
(SCENT) is presented for acoustic modeling in voice conversion. At training stage, a SCENT …