An overview of affective speech synthesis and conversion in the deep learning era
Speech is the fundamental mode of human communication, and its synthesis has long been
a core priority in human–computer interaction research. In recent years, machines have …
Speech synthesis with mixed emotions
Emotional speech synthesis aims to synthesize human voices with various emotional effects.
The current studies are mostly focused on imitating an averaged style belonging to a specific …
Emotion rendering for conversational speech synthesis with heterogeneous graph-based context modeling
Conversational Speech Synthesis (CSS) aims to accurately express an utterance with the
appropriate prosody and emotional inflection within a conversational setting. While …
EmoDiff: Intensity controllable emotional text-to-speech with soft-label guidance
Although current neural text-to-speech (TTS) models are able to generate high-quality
speech, intensity controllable emotional TTS is still a challenging task. Most existing …
An overview & analysis of sequence-to-sequence emotional voice conversion
Emotional voice conversion (EVC) focuses on converting a speech utterance from a source
to a target emotion; it can thus be a key enabling technology for human-computer interaction …
EmoMix: Emotion mixing via diffusion models for emotional speech synthesis
There has been significant progress in emotional Text-To-Speech (TTS) synthesis
technology in recent years. However, existing methods primarily focus on the synthesis of a …
Speech based suicide risk recognition for crisis intervention hotlines using explainable multi-task learning
Background: Crisis Intervention Hotline can effectively reduce suicide risk, but suffer
from low connectivity rates and untimely crisis response. By integrating speech signals and …
Probing speech emotion recognition transformers for linguistic knowledge
Large, pre-trained neural networks consisting of self-attention layers (transformers) have
recently achieved state-of-the-art results on several speech emotion recognition (SER) …
Disentanglement of emotional style and speaker identity for expressive voice conversion
Expressive voice conversion performs identity conversion for emotional speakers by jointly
converting speaker identity and emotional style. Due to the hierarchical structure of speech …
Hierarchical emotion prediction and control in text-to-speech synthesis
It remains a challenge to effectively control the emotion rendering in text-to-speech (TTS)
synthesis. Prior studies have primarily focused on learning a global prosodic representation …