Emotion intensity and its control for emotional voice conversion

K Zhou, B Sisman, R Rana… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Emotional voice conversion (EVC) seeks to convert the emotional state of an utterance while
preserving the linguistic content and speaker identity. In EVC, emotions are usually treated …

Speech synthesis with mixed emotions

K Zhou, B Sisman, R Rana… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Emotional speech synthesis aims to synthesize human voices with various emotional effects.
The current studies are mostly focused on imitating an averaged style belonging to a specific …

Improving recognition-synthesis based any-to-one voice conversion with cyclic training

YN Chen, LJ Liu, YJ Hu, Y Jiang… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
In recognition-synthesis based any-to-one voice conversion (VC), an automatic speech
recognition (ASR) model is employed to extract content-related features and a synthesizer is …

Streaming non-autoregressive model for any-to-many voice conversion

Z Chen, H Miao, P Zhang - arXiv preprint arXiv:2206.07288, 2022 - arxiv.org
Voice conversion models have developed for decades, and current mainstream research
focuses on non-streaming voice conversion. However, streaming voice conversion is more …

Deep convolutional neural network for voice liveness detection

S Gupta, K Khoria, AT Patil… - 2021 Asia-Pacific Signal …, 2021 - ieeexplore.ieee.org
In this work, we present the system to detect the liveness by identifying the pop noise in the
voice signal in order to avoid the security breach of ASV systems. Pop noise is created due …

Emotion modelling for speech generation

K Zhou - 2023 - search.proquest.com
Speech generation aims to synthesize human-like voices from the input of text or speech.
Current speech generation techniques can generate high quality, natural-sounding speech …

TVQVC: Transformer Based Vector Quantized Variational Autoencoder with CTC Loss for Voice Conversion

Z Chen, P Zhang - Interspeech, 2021 - isca-archive.org
Techniques of voice conversion (VC) aim to modify the speaker identity and style of an
utterance while preserving the linguistic content. Although there are lots of VC methods, the …

VoiceGrad: Non-Parallel Any-to-Many Voice Conversion With Annealed Langevin Dynamics

H Kameoka, T Kaneko, K Tanaka… - … /ACM Transactions on …, 2024 - ieeexplore.ieee.org
In this paper, we propose a non-parallel any-to-many voice conversion (VC) method termed
VoiceGrad. Inspired by WaveGrad, a recently introduced novel waveform generation …