Using transformers for multimodal emotion recognition: Taxonomies and state of the art review
Emotion recognition is an aspect of human-computer interaction, affective computing, and
social robotics. Conventional unimodal approaches for emotion recognition, depending on …
Vesper: A compact and effective pretrained model for speech emotion recognition
This paper presents a paradigm that adapts general large-scale pretrained models (PTMs)
to speech emotion recognition task. Although PTMs shed new light on artificial general …
Multimodal emotion recognition using cross modal audio-video fusion with attention and deep metric learning
In the last few years, the multi-modal emotion recognition has become an important research
issue in the affective computing community due to its wide range of applications that include …
Dreamtalk: When expressive talking head generation meets diffusion probabilistic models
Diffusion models have shown remarkable success in a variety of downstream generative
tasks, yet remain under-explored in the important and challenging expressive talking head …
Speechformer++: A hierarchical efficient framework for paralinguistic speech processing
Paralinguistic speech processing is important in addressing many issues, such as sentiment
and neurocognitive disorder analyses. Recently, Transformer has achieved remarkable …
Cross-language speech emotion recognition using multimodal dual attention transformers
Despite the recent progress in speech emotion recognition (SER), state-of-the-art systems
are unable to achieve improved performance in cross-language settings. In this paper, we …
Knowledge-aware Bayesian co-attention for multimodal emotion recognition
Multimodal emotion recognition is a challenging research area that aims to fuse different
modalities to predict human emotion. However, most existing models that are based on …
Multi-scale temporal transformer for speech emotion recognition
Z Li, X Xing, Y Fang, W Zhang, H Fan, X Xu - arXiv preprint arXiv …, 2024 - arxiv.org
Speech emotion recognition plays a crucial role in human-machine interaction systems.
Recently various optimized Transformers have been successfully applied to speech emotion …
DST: Deformable speech transformer for emotion recognition
Enabled by multi-head self-attention, Transformer has exhibited remarkable results in
speech emotion recognition (SER). Compared to the original full attention mechanism …
A multi-stage hierarchical relational graph neural network for multimodal sentiment analysis
P Gong, J Liu, X Zhang, X Li - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org
Multimodal sentiment analysis targets at accurately perceiving the emotional states by
incorporating related information from multiple sources. However, existing methods mostly …