An engineering view on emotions and speech: From analysis and predictive models to responsible human-centered applications

CC Lee, T Chaspari, EM Provost… - Proceedings of the …, 2023 - ieeexplore.ieee.org
The substantial growth of Internet-of-Things technology and the ubiquity of smartphone
devices have increased the public and industry focus on speech emotion recognition (SER) …

Deep neural networks for automatic speech processing: a survey from large corpora to limited data

V Roger, J Farinas, J Pinquier - EURASIP Journal on Audio, Speech, and …, 2022 - Springer
Most state-of-the-art speech systems use deep neural networks (DNNs). These systems
require a large amount of data to be trained. Hence, training state-of-the-art frameworks on …

Toward label-efficient emotion and sentiment analysis

S Zhao, X Hong, J Yang, Y Zhao… - Proceedings of the …, 2023 - ieeexplore.ieee.org
Emotion and sentiment play a central role in various human activities, such as perception,
decision-making, social interaction, and logical reasoning. Developing artificial emotional …

Mmad: Multi-label micro-action detection in videos

K Li, D Guo, P Liu, G Chen, M Wang - arXiv preprint arXiv:2407.05311, 2024 - arxiv.org
Human body actions are an important form of non-verbal communication in social
interactions. This paper focuses on a specific subset of body actions known as micro …

Fusion of spectral and prosody modelling for multilingual speech emotion conversion

S Vekkot, D Gupta - Knowledge-Based Systems, 2022 - Elsevier
The paper proposes an integrated speech emotion conversion framework developed using
speaker-independent mixed-lingual training. The key contribution of the work is non-parallel …

Few-shot class-incremental audio classification via discriminative prototype learning

W Xie, Y Li, Q He, W Cao - Expert Systems with Applications, 2023 - Elsevier
In real-world scenarios, new audio classes with insufficient samples usually emerge
continually, which motivates the study of few-shot class-incremental audio classification …
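As a rough illustration of the prototype idea named in the title (not the authors' specific discriminative prototype learning method), a few labelled support clips per new class can be averaged in embedding space and queries assigned to the nearest prototype; the embedding model and dimensionality below are assumptions.

```python
# Minimal sketch of prototype-based few-shot classification: each new class is
# represented by the mean of its few support embeddings, and a query is assigned
# to the class whose prototype is closest. The audio embedding model itself is
# assumed to exist elsewhere; embeddings here are random placeholders.
import numpy as np

def build_prototypes(support_embeddings, support_labels):
    """Average the support embeddings of each class into one prototype vector."""
    prototypes = {}
    for label in np.unique(support_labels):
        prototypes[label] = support_embeddings[support_labels == label].mean(axis=0)
    return prototypes

def classify(query_embedding, prototypes):
    """Assign the query to the nearest prototype in Euclidean distance."""
    distances = {label: np.linalg.norm(query_embedding - proto)
                 for label, proto in prototypes.items()}
    return min(distances, key=distances.get)

# Toy usage: 5 support clips per class, 16-dimensional (hypothetical) embeddings.
rng = np.random.default_rng(0)
support = rng.normal(size=(10, 16))
labels = np.array([0] * 5 + [1] * 5)
protos = build_prototypes(support, labels)
print(classify(rng.normal(size=16), protos))
```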

Meltrans: Mel-spectrogram relationship-learning for speech emotion recognition via transformers

H Li, J Li, H Liu, T Liu, Q Chen, X You - Sensors, 2024 - mdpi.com
Speech emotion recognition (SER) is not only a ubiquitous aspect of everyday
communication, but also a central focus in the field of human–computer interaction …
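For context on the input representation named in the title, the snippet below sketches a standard log-mel-spectrogram front end of the kind typically fed to transformer-based SER models; it uses librosa, and the sample rate, mel-band count, and hop length are illustrative assumptions rather than values from the paper.

```python
# Generic log-mel-spectrogram feature extraction for speech emotion recognition.
# This only illustrates the input representation; all parameters are assumptions.
import librosa
import numpy as np

def melspectrogram_features(wav_path, sr=16000, n_mels=80, hop_length=160):
    """Load audio, compute a mel spectrogram, and return it in log (dB) scale."""
    audio, sr = librosa.load(wav_path, sr=sr)
    mel = librosa.feature.melspectrogram(
        y=audio, sr=sr, n_mels=n_mels, hop_length=hop_length)
    return librosa.power_to_db(mel, ref=np.max)  # shape: (n_mels, n_frames)
```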

Efficient labelling of affective video datasets via few-shot & multi-task contrastive learning

R Parameshwara, I Radwan, A Asthana… - Proceedings of the 31st …, 2023 - dl.acm.org
Whilst deep learning techniques have achieved excellent emotion prediction, they still
require large amounts of labelled training data, which are (a) onerous and tedious to …
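As background on the contrastive component named in the title (not the paper's actual multi-task objective), the sketch below implements the generic NT-Xent loss over two augmented views of a batch of embeddings; the batch size, temperature, and embedding dimensionality are assumptions.

```python
# Bare-bones NT-Xent (normalized temperature-scaled cross-entropy) loss, the
# standard objective behind many contrastive, label-efficient methods.
import numpy as np

def nt_xent(z1, z2, temperature=0.5):
    """z1, z2: (N, d) embeddings of two augmented views of the same N clips."""
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # cosine similarity
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)                     # exclude self-pairs
    n = len(z1)
    positives = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), positives].mean()

# Toy usage with random (hypothetical) embeddings for a batch of 8 clips.
rng = np.random.default_rng(0)
print(nt_xent(rng.normal(size=(8, 32)), rng.normal(size=(8, 32))))
```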

Self-labeling with feature transfer for speech emotion recognition

G Wen, H Liao, H Li, P Wen, T Zhang, S Gao… - Knowledge-Based …, 2022 - Elsevier
Most frame-based speech emotion recognition methods have achieved good results in
many applications. However, they segment each speech sample into smaller frames that are …
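To make the frame-level view mentioned in the snippet concrete, the sketch below segments an utterance into short overlapping frames; the 25 ms window and 10 ms hop at 16 kHz are conventional assumptions, not values taken from the paper.

```python
# Minimal sketch of frame segmentation: cut a waveform into short overlapping
# frames that can later be embedded and labelled individually.
import numpy as np

def segment_into_frames(signal, frame_length=400, hop_length=160):
    """Return a (num_frames, frame_length) array of overlapping frames."""
    num_frames = 1 + (len(signal) - frame_length) // hop_length
    indices = (np.arange(frame_length)[None, :]
               + hop_length * np.arange(num_frames)[:, None])
    return signal[indices]

# Toy usage on one second of silence sampled at 16 kHz.
frames = segment_into_frames(np.zeros(16000))
print(frames.shape)  # (98, 400)
```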

MOCAS: A multimodal dataset for objective cognitive workload assessment on simultaneous tasks

W Jo, R Wang, GE Cha, S Sun… - IEEE Transactions …, 2024 - ieeexplore.ieee.org
This paper presents MOCAS, a multimodal dataset dedicated to human cognitive workload
(CWL) assessment. In contrast to existing datasets based on virtual game stimuli, the data in …