An engineering view on emotions and speech: From analysis and predictive models to responsible human-centered applications
The substantial growth of Internet-of-Things technology and the ubiquity of smartphone
devices has increased the public and industry focus on speech emotion recognition (SER) …
Open-Emotion: A Reproducible EMO-Superb For Speech Emotion Recognition Systems
Speech emotion recognition (SER) is an essential technology for human-computer
interaction systems. However, the previous study reveals that 80.77% of SER papers yield …
EMO-SUPERB: An in-depth look at speech emotion recognition
Speech emotion recognition (SER) is a pivotal technology for human-computer interaction
systems. However, 80.77% of SER papers yield results that cannot be reproduced. We …
Estimating the uncertainty in emotion attributes using deep evidential regression
In automatic emotion recognition (AER), labels assigned by different human annotators to
the same utterance are often inconsistent due to the inherent complexity of emotion and the …
Exploiting co-occurrence frequency of emotions in perceptual evaluations to train a speech emotion classifier
Previous studies on speech emotion recognition (SER) with categorical emotions have often
formulated the task as a single-label classification problem, where the emotions are …
Deep temporal clustering features for speech emotion recognition
Deep clustering is a popular unsupervised technique for feature representation learning. We
recently proposed the chunk-based DeepEmoCluster framework for speech emotion …
Learning With Rater-Expanded Label Space to Improve Speech Emotion Recognition
Automatic sensing of emotional information in speech is important for numerous everyday
applications. Conventional Speech Emotion Recognition (SER) models rely on averaging or …
Disentangling prosody representations with unsupervised speech reconstruction
Human speech can be characterized by different components, including semantic content,
speaker identity and prosodic information. Significant progress has been made in …
Extending speech emotion recognition systems to non-prototypical emotions using mixed-emotion model
In the conventional approach to speech emotion recognition (SER), the classifier is usually
trained on acted emotional speech data to predict individual basic emotions. In this work, we …
Subjective evaluation of basic emotions from audio–visual data
Understanding of the perception of emotions or affective states in humans is important to
develop emotion-aware systems that work in realistic scenarios. In this paper, the perception …