Multimodal machine learning: A survey and taxonomy
Our experience of the world is multimodal: we see objects, hear sounds, feel texture, smell
odors, and taste flavors. Modality refers to the way in which something happens or is …
A survey on integrated sensing, communication, and computation
The forthcoming generation of wireless technology, 6G, promises a revolutionary leap
beyond traditional data-centric services. It aims to usher in an era of ubiquitous intelligent …
Exploiting multi-CNN features in CNN-RNN based dimensional emotion recognition on the OMG in-the-wild dataset
This article presents a novel CNN-RNN based approach, which exploits multiple CNN
features for dimensional emotion recognition in-the-wild, utilizing the One-Minute Gradual …
Learn to combine modalities in multimodal deep learning
Combining complementary information from multiple modalities is intuitively appealing for
improving the performance of learning-based approaches. However, it is challenging to fully …
M2Lens: Visualizing and explaining multimodal models for sentiment analysis
Multimodal sentiment analysis aims to recognize people's attitudes from multiple
communication channels such as verbal content (i.e., text), voice, and facial expressions. It …
Neural multimodal cooperative learning toward micro-video understanding
The prevailing characteristics of micro-videos result in the less descriptive power of each
modality. The micro-video representations, several pioneer efforts proposed, are limited in …
Learning in audio-visual context: A review, analysis, and new perspective
Sight and hearing are two senses that play a vital role in human communication and scene
understanding. To mimic human perception ability, audio-visual learning, aimed at …
LSTM-modeling of continuous emotions in an audiovisual affect recognition framework
Automatically recognizing human emotions from spontaneous and non-prototypical real-life
data is currently one of the most challenging tasks in the field of affective computing. This …
Multimodal categorization of crisis events in social media
Recent developments in image classification and natural language processing, coupled with
the rapid growth in social media usage, have enabled fundamental advances in detecting …
Survey on audiovisual emotion recognition: databases, features, and data fusion strategies
Emotion recognition is the ability to identify what people would think someone is feeling from
moment to moment and understand the connection between his/her feelings and …