Google Scholar

An overview of voice conversion and its challenges: From statistical modeling to deep learning

B Sisman, J Yamagishi, S King… - IEEE/ACM Transactions …, 2020 - ieeexplore.ieee.org

Speaker identity is one of the important characteristics of human speech. In voice
conversion, we change the speaker identity from one to another, while keeping the linguistic …

Cited by 419 · 9 versions

[PDF] sciencedirect.com

Emotional voice conversion: Theory, databases and ESD

K Zhou, B Sisman, R Liu, H Li - Speech Communication, 2022 - Elsevier

In this paper, we first provide a review of the state-of-the-art emotional voice conversion
research, and the existing emotional speech databases. We then motivate the development …

Cited by 192 · 7 versions

[PDF] arxiv.org

Seen and unseen emotional style transfer for voice conversion with a new emotional speech dataset

K Zhou, B Sisman, R Liu, H Li - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org

Emotional voice conversion aims to transform emotional prosody in speech while preserving
the linguistic content and speaker identity. Prior studies show that it is possible to …

Cited by 241 · 8 versions

[PDF] ieee.org

Expressive TTS training with frame and style reconstruction loss

R Liu, B Sisman, G Gao, H Li - IEEE/ACM Transactions on …, 2021 - ieeexplore.ieee.org

We propose a novel training strategy for Tacotron-based text-to-speech (TTS) system that
improves the speech styling at utterance level. One of the key challenges in prosody …

Cited by 93 · 4 versions

[PDF] arxiv.org

Transforming spectrum and prosody for emotional voice conversion with non-parallel training data

K Zhou, B Sisman, H Li - arXiv preprint arXiv:2002.00198, 2020 - arxiv.org

Emotional voice conversion aims to convert the spectrum and prosody to change the
emotional patterns of speech, while preserving the speaker identity and linguistic content …

Cited by 89 · 6 versions

[PDF] ieee.org

Transfer learning from speech synthesis to voice conversion with non-parallel training data

M Zhang, Y Zhou, L Zhao, H Li - IEEE/ACM Transactions on …, 2021 - ieeexplore.ieee.org

We present a novel voice conversion (VC) framework by learning from a text-to-speech
(TTS) synthesis system, that is called TTS-VC transfer learning or TTL-VC for short. We first …

Cited by 68 · 5 versions

[PDF] arxiv.org

Converting anyone's emotion: Towards speaker-independent emotional voice conversion

K Zhou, B Sisman, M Zhang, H Li - arXiv preprint arXiv:2005.07025, 2020 - arxiv.org

Emotional voice conversion aims to convert the emotion of speech from one state to another
while preserving the linguistic content and speaker identity. The prior studies on emotional …

Cited by 67 · 8 versions

[PDF] sjtu.edu.cn

Modified magnitude-phase spectrum information for spoofing detection

J Yang, H Wang, RK Das, Y Qian - IEEE/ACM Transactions on …, 2021 - ieeexplore.ieee.org

Most of the existing feature representations for spoofing countermeasures consider
information either from the magnitude or phase spectrum. We hypothesize that both …

Cited by 46 · 4 versions

[PDF] arxiv.org

Teacher-student training for robust Tacotron-based TTS

R Liu, B Sisman, J Li, F Bao, G Gao… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org

While neural end-to-end text-to-speech (TTS) is superior to conventional statistical methods
in many ways, the exposure bias problem in the autoregressive models remains an issue to …

Cited by 67 · 7 versions

[PDF] arxiv.org

Limited data emotional voice conversion leveraging text-to-speech: Two-stage sequence-to-sequence training

K Zhou, B Sisman, H Li - arXiv preprint arXiv:2103.16809, 2021 - arxiv.org

Emotional voice conversion (EVC) aims to change the emotional state of an utterance while
preserving the linguistic content and speaker identity. In this paper, we propose a novel 2 …

Cited by 43 · 6 versions
