A survey of deep learning-based multimodal emotion recognition: Speech, text, and face
Multimodal emotion recognition (MER) refers to the identification and understanding of
human emotional states by combining different signals, including—but not limited to—text …
A compact embedding for facial expression similarity
Most of the existing work on automatic facial expression analysis focuses on discrete
emotion recognition, or facial action unit detection. However, facial expressions do not …
Unsupervised representation learning for gaze estimation
Although automatic gaze estimation is very important to a large variety of application areas,
it is difficult to train accurate and robust gaze models, in large part due to the difficulty in …
KeypointDeformer: Unsupervised 3D keypoint discovery for shape control
We introduce KeypointDeformer, a novel unsupervised method for shape control through
automatically discovered 3D keypoints. We cast this as the problem of aligning a source 3D …
SelfME: Self-supervised motion learning for micro-expression recognition
Facial micro-expressions (MEs) refer to brief spontaneous facial movements that can reveal
a person's genuine emotion. They are valuable in lie detection, criminal analysis, and other …
Pose-disentangled contrastive learning for self-supervised facial representation
Self-supervised facial representation has recently attracted increasing attention due to its
ability to perform face understanding without relying on large-scale annotated datasets …
Self-supervised learning for facial action unit recognition through temporal consistency
Facial expressions have inherent temporal dependencies that can be leveraged in
automatic facial expression analysis from videos. In this paper, we propose a self …
Inter-intra modal representation augmentation with trimodal collaborative disentanglement network for multimodal sentiment analysis
Multimodal Sentiment Analysis (MSA) has recently become a challenging research area given its
complex nature: humans express emotional cues across various modalities such as …
Weakly supervised regional and temporal learning for facial action unit recognition
Automatic facial action unit (AU) recognition is a challenging task due to the scarcity of
manual annotations. To alleviate this problem, considerable effort has been dedicated …
Families in wild multimedia: A multimodal database for recognizing kinship
Kinship, a soft biometric detectable in media, is fundamental for a myriad of use-cases.
Despite the difficulty of detecting kinship, annual data challenges using still-images have …