Foundations & trends in multimodal machine learning: Principles, challenges, and open questions
Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …
computer agents with intelligent capabilities such as understanding, reasoning, and learning …
Multimodal research in vision and language: A review of current and emerging trends
Deep Learning and its applications have cascaded impactful research and development
with a diverse range of modalities present in the real-world data. More recently, this has …
with a diverse range of modalities present in the real-world data. More recently, this has …
Misa: Modality-invariant and-specific representations for multimodal sentiment analysis
Multimodal Sentiment Analysis is an active area of research that leverages multimodal
signals for affective understanding of user-generated videos. The predominant approach …
signals for affective understanding of user-generated videos. The predominant approach …
Disentangled representation learning for multimodal emotion recognition
Multimodal emotion recognition aims to identify human emotions from text, audio, and visual
modalities. Previous methods either explore correlations between different modalities or …
modalities. Previous methods either explore correlations between different modalities or …
Bi-bimodal modality fusion for correlation-controlled multimodal sentiment analysis
Multimodal sentiment analysis aims to extract and integrate semantic information collected
from multiple modalities to recognize the expressed emotions and sentiment in multimodal …
from multiple modalities to recognize the expressed emotions and sentiment in multimodal …
[HTML][HTML] Multibench: Multiscale benchmarks for multimodal representation learning
Learning multimodal representations involves integrating information from multiple
heterogeneous sources of data. It is a challenging yet crucial area with numerous real-world …
heterogeneous sources of data. It is a challenging yet crucial area with numerous real-world …
Beneath the tip of the iceberg: Current challenges and new directions in sentiment analysis research
Sentiment analysis as a field has come a long way since it was first introduced as a task
nearly 20 years ago. It has widespread commercial applications in various domains like …
nearly 20 years ago. It has widespread commercial applications in various domains like …
A low-rank matching attention based cross-modal feature fusion method for conversational emotion recognition
Conversational emotion recognition (CER) is an important research topic in human-
computer interactions. Although recent advancements in transformer-based cross-modal …
computer interactions. Although recent advancements in transformer-based cross-modal …
Xmecap: Meme caption generation with sub-image adaptability
Humor, deeply rooted in societal meanings and cultural details, poses a unique challenge
for machines. While advances have been made in natural language processing, real-world …
for machines. While advances have been made in natural language processing, real-world …
Quantifying & modeling multimodal interactions: An information decomposition framework
The recent explosion of interest in multimodal applications has resulted in a wide selection
of datasets and methods for representing and integrating information from different …
of datasets and methods for representing and integrating information from different …