Speech synthesis with mixed emotions
Emotional speech synthesis aims to synthesize human voices with various emotional effects.
The current studies are mostly focused on imitating an averaged style belonging to a specific …
Emotion intensity and its control for emotional voice conversion
Emotional voice conversion (EVC) seeks to convert the emotional state of an utterance while
preserving the linguistic content and speaker identity. In EVC, emotions are usually treated …
VISinger 2: High-fidelity end-to-end singing voice synthesis enhanced by digital signal processing synthesizer
End-to-end singing voice synthesis (SVS) model VISinger can achieve better performance
than the typical two-stage model with fewer parameters. However, VISinger has several …
StyleTTS-VC: One-shot voice conversion by knowledge transfer from style-based TTS models
One-shot voice conversion (VC) aims to convert speech from any source speaker to an
arbitrary target speaker with only a few seconds of reference speech from the target speaker …
Converting foreign accent speech without a reference
Foreign accent conversion (FAC) is the problem of generating a synthetic voice that has the
voice identity of a second-language (L2) learner and the pronunciation patterns of a native …
Prompt-Singer: Controllable singing-voice-synthesis with natural language prompt
Recent singing-voice-synthesis (SVS) methods have achieved remarkable audio quality and
naturalness, yet they lack the capability to control the style attributes of the synthesized …
Data augmentation for children ASR and child-adult speaker classification using voice conversion methods
Many young children prefer speech based interfaces over text, as they are relatively slow
and error-prone with text input. However, children ASR can be challenging due to the lack of …
Acoustic tracking of pitch, modal, and subharmonic vibrations of vocal folds in Parkinson's disease and parkinsonism
The prominent and early presence of dysphonia is considered a valuable marker for
differentiation of idiopathic Parkinson's disease and parkinsonian syndromes. Objective …
Speech synthesis from articulatory movements recorded by real-time MRI
Y Otani, S Sawada, H Ohmura, K Katsurada - Proc. Interspeech, 2023 - isca-archive.org
Previous speech synthesis models from articulatory movements recorded using real-time
MRI (rtMRI) only predicted vocal tract shape parameters and required additional pitch …
A comparative study of voice conversion models with large-scale speech and singing data: The T13 systems for the singing voice conversion challenge 2023
This paper presents our systems (denoted as T13) for the singing voice conversion
challenge (SVCC) 2023. For both in-domain and cross-domain English singing voice …