An engineering view on emotions and speech: From analysis and predictive models to responsible human-centered applications
The substantial growth of Internet-of-Things technology and the ubiquity of smartphone
devices has increased the public and industry focus on speech emotion recognition (SER) …
Deep neural networks for automatic speech processing: a survey from large corpora to limited data
Most state-of-the-art speech systems use deep neural networks (DNNs). These systems
require a large amount of data to be learned. Hence, training state-of-the-art frameworks on …
Toward label-efficient emotion and sentiment analysis
Emotion and sentiment play a central role in various human activities, such as perception,
decision-making, social interaction, and logical reasoning. Developing artificial emotional …
MMAD: Multi-label micro-action detection in videos
Human body actions are an important form of non-verbal communication in social
interactions. This paper focuses on a specific subset of body actions known as micro …
Fusion of spectral and prosody modelling for multilingual speech emotion conversion
S Vekkot, D Gupta - Knowledge-Based Systems, 2022 - Elsevier
The paper proposes an integrated speech emotion conversion framework developed using
speaker-independent mixed-lingual training. The key contribution of the work is non-parallel …
Few-shot class-incremental audio classification via discriminative prototype learning
In real-world scenarios, new audio classes with insufficient samples usually emerge
continually, which motivates the study of few-shot class-incremental audio classification …
Meltrans: Mel-spectrogram relationship-learning for speech emotion recognition via transformers
H Li, J Li, H Liu, T Liu, Q Chen, X You - Sensors, 2024 - mdpi.com
Speech emotion recognition (SER) is not only a ubiquitous aspect of everyday
communication, but also a central focus in the field of human–computer interaction …
Efficient labelling of affective video datasets via few-shot & multi-task contrastive learning
Whilst deep learning techniques have achieved excellent emotion prediction, they still
require large amounts of labelled training data, which are (a) onerous and tedious to …
Self-labeling with feature transfer for speech emotion recognition
Most speech emotion recognition methods based on frames have obtained good results in
many applications. However, they segment each speech sample into smaller frames that are …
MOCAS: A multimodal dataset for objective cognitive workload assessment on simultaneous tasks
This paper presents MOCAS, a multimodal dataset dedicated for human cognitive workload
(CWL) assessment. In contrast to existing datasets based on virtual game stimuli, the data in …