Speech synthesis with mixed emotions

K Zhou, B Sisman, R Rana… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

Emotional speech synthesis aims to synthesize human voices with various emotional effects.
The current studies are mostly focused on imitating an averaged style belonging to a specific …

Cited by 61. All versions: 7.

Emotion intensity and its control for emotional voice conversion

K Zhou, B Sisman, R Rana… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

Emotional voice conversion (EVC) seeks to convert the emotional state of an utterance while
preserving the linguistic content and speaker identity. In EVC, emotions are usually treated …

Cited by 65. All versions: 8.

VISinger 2: High-fidelity end-to-end singing voice synthesis enhanced by digital signal processing synthesizer

Y Zhang, H Xue, H Li, L Xie, T Guo, R Zhang… - arXiv preprint arXiv …, 2022 - arxiv.org

End-to-end singing voice synthesis (SVS) model VISinger can achieve better performance
than the typical two-stage model with fewer parameters. However, VISinger has several …

Cited by 28. All versions: 4.

StyleTTS-VC: One-shot voice conversion by knowledge transfer from style-based TTS models

YA Li, C Han, N Mesgarani - 2022 IEEE Spoken Language …, 2023 - ieeexplore.ieee.org

One-shot voice conversion (VC) aims to convert speech from any source speaker to an
arbitrary target speaker with only a few seconds of reference speech from the target speaker …

Cited by 20. All versions: 6.

Converting foreign accent speech without a reference

G Zhao, S Ding… - IEEE/ACM Transactions on …, 2021 - ieeexplore.ieee.org

Foreign accent conversion (FAC) is the problem of generating a synthetic voice that has the
voice identity of a second-language (L2) learner and the pronunciation patterns of a native …

Cited by 32. All versions: 4.

Prompt-Singer: Controllable singing-voice-synthesis with natural language prompt

Y Wang, R Hu, R Huang, Z Hong, R Li, W Liu… - arXiv preprint arXiv …, 2024 - arxiv.org

Recent singing-voice-synthesis (SVS) methods have achieved remarkable audio quality and
naturalness, yet they lack the capability to control the style attributes of the synthesized …

Cited by 4. All versions: 6.

Data augmentation for children ASR and child-adult speaker classification using voice conversion methods

Z Shuyang, M Singh, A Woubie, R Karhila - Proc. Interspeech, 2023 - isca-archive.org

Many young children prefer speech based interfaces over text, as they are relatively slow
and error-prone with text input. However, children ASR can be challenging due to the lack of …

Cited by 11. All versions: 3.

Acoustic tracking of pitch, modal, and subharmonic vibrations of vocal folds in Parkinson's disease and parkinsonism

J Hlavnička, R Čmejla, J Klempíř, E Růžička… - IEEE Access, 2019 - ieeexplore.ieee.org

The prominent and early presence of dysphonia is considered a valuable marker for
differentiation of idiopathic Parkinson's disease and parkinsonian syndromes. Objective …

Cited by 43. All versions: 3.

Speech synthesis from articulatory movements recorded by real-time MRI

Y Otani, S Sawada, H Ohmura, K Katsurada - Proc. Interspeech, 2023 - isca-archive.org

Previous speech synthesis models from articulatory movements recorded using real-time
MRI (rtMRI) only predicted vocal tract shape parameters and required additional pitch …

Cited by 10. All versions: 5.

A comparative study of voice conversion models with large-scale speech and singing data: The T13 systems for the singing voice conversion challenge 2023

R Yamamoto, R Yoneyama, LP Violeta… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org

This paper presents our systems (denoted as T13) for the singing voice conversion
challenge (SVCC) 2023. For both in-domain and cross-domain English singing voice …

Cited by 8. All versions: 4.

Harvest: A High-Performance Fundamental Frequency Estimator from Speech Signals.
