Multimodal machine learning: A survey and taxonomy

T Baltrušaitis, C Ahuja… - IEEE transactions on …, 2018 - ieeexplore.ieee.org
Our experience of the world is multimodal - we see objects, hear sounds, feel texture, smell
odors, and taste flavors. Modality refers to the way in which something happens or is …

A survey on integrated sensing, communication, and computation

D Wen, Y Zhou, X Li, Y Shi, K Huang… - … Surveys & Tutorials, 2024 - ieeexplore.ieee.org
The forthcoming generation of wireless technology, 6G, promises a revolutionary leap
beyond traditional data-centric services. It aims to usher in an era of ubiquitous intelligent …

Exploiting multi-CNN features in CNN-RNN based dimensional emotion recognition on the OMG in-the-wild dataset

D Kollias, S Zafeiriou - IEEE Transactions on Affective …, 2020 - ieeexplore.ieee.org
This article presents a novel CNN-RNN based approach, which exploits multiple CNN
features for dimensional emotion recognition in-the-wild, utilizing the One-Minute Gradual …

Learn to combine modalities in multimodal deep learning

K Liu, Y Li, N Xu, P Natarajan - arXiv preprint arXiv:1805.11730, 2018 - arxiv.org
Combining complementary information from multiple modalities is intuitively appealing for
improving the performance of learning-based approaches. However, it is challenging to fully …

M2Lens: Visualizing and explaining multimodal models for sentiment analysis

X Wang, J He, Z Jin, M Yang, Y Wang… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Multimodal sentiment analysis aims to recognize people's attitudes from multiple
communication channels such as verbal content (i.e., text), voice, and facial expressions. It …

Neural multimodal cooperative learning toward micro-video understanding

Y Wei, X Wang, W Guan, L Nie, Z Lin… - IEEE Transactions on …, 2019 - ieeexplore.ieee.org
The prevailing characteristics of micro-videos result in the less descriptive power of each
modality. The micro-video representations, several pioneer efforts proposed, are limited in …

Learning in audio-visual context: A review, analysis, and new perspective

Y Wei, D Hu, Y Tian, X Li - arXiv preprint arXiv:2208.09579, 2022 - arxiv.org
Sight and hearing are two senses that play a vital role in human communication and scene
understanding. To mimic human perception ability, audio-visual learning, aimed at …

LSTM-modeling of continuous emotions in an audiovisual affect recognition framework

M Wöllmer, M Kaiser, F Eyben, B Schuller… - Image and Vision …, 2013 - Elsevier
Automatically recognizing human emotions from spontaneous and non-prototypical real-life
data is currently one of the most challenging tasks in the field of affective computing. This …

Multimodal categorization of crisis events in social media

M Abavisani, L Wu, S Hu… - Proceedings of the …, 2020 - openaccess.thecvf.com
Recent developments in image classification and natural language processing, coupled with
the rapid growth in social media usage, have enabled fundamental advances in detecting …

Survey on audiovisual emotion recognition: databases, features, and data fusion strategies

CH Wu, JC Lin, WL Wei - APSIPA transactions on signal and …, 2014 - cambridge.org
Emotion recognition is the ability to identify what people would think someone is feeling from
moment to moment and understand the connection between his/her feelings and …