Adversarial alignment and graph fusion via information bottleneck for multimodal emotion recognition in conversations

Y Shou, T Meng, W Ai, F Zhang, N Yin, K Li - Information Fusion, 2024 - Elsevier
With the rapid development of social media and human–computer interaction, multimodal
emotion recognition in conversations (MERC) tasks have begun to receive widespread …

Adversarial representation with intra-modal and inter-modal graph contrastive learning for multimodal emotion recognition

Y Shou, T Meng, W Ai, N Yin, K Li - arXiv preprint arXiv:2312.16778, 2023 - arxiv.org
With the release of increasing open-source emotion recognition datasets on social media
platforms and the rapid development of computing resources, multimodal emotion …

Variational causal inference network for explanatory visual question answering

D Xue, S Qian, C Xu - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Explanatory Visual Question Answering (EVQA) is a recently proposed multimodal
reasoning task that requires answering visual questions and generating multimodal …

LDRE: LLM-based Divergent Reasoning and Ensemble for Zero-Shot Composed Image Retrieval

Z Yang, D Xue, S Qian, W Dong, C Xu - Proceedings of the 47th …, 2024 - dl.acm.org
Zero-Shot Composed Image Retrieval (ZS-CIR) has garnered increasing interest in recent
years, which aims to retrieve a target image based on a query composed of a reference …

Multi-level contrastive learning: Hierarchical alleviation of heterogeneity in multimodal sentiment analysis

C Fan, K Zhu, J Tao, G Yi, J Xue… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Recently, multimodal fusion efforts have achieved remarkable success in Multimodal
Sentiment Analysis (MSA). However, most of the existing methods are based on model-level …

A survey on cross-media search based on user intention understanding in social networks

L Shi, J Luo, C Zhu, F Kou, G Cheng, X Liu - Information Fusion, 2023 - Elsevier
With the increasing popularity of online social networks, more and more people are posting
information, updating their statuses, and searching for topics there. Massive cross-media big …

Adversarial Graph Neural Network for Multivariate Time Series Anomaly Detection

B Zheng, L Ming, K Zeng, M Zhou… - … on Knowledge and …, 2024 - ieeexplore.ieee.org
Anomaly detection is one of the most significant tasks in multivariate time series analysis,
while it remains challenging to model complex patterns for improving detection accuracy …

Open-world social event classification

S Qian, H Chen, D Xue, Q Fang, C Xu - Proceedings of the ACM Web …, 2023 - dl.acm.org
With the rapid development of Internet and the expanding scale of social media, social event
classification has attracted increasing attention. The key to social event classification is …

EduCross: Dual adversarial bipartite hypergraph learning for cross-modal retrieval in multimodal educational slides

M Li, S Zhou, Y Chen, C Huang, Y Jiang - Information Fusion, 2024 - Elsevier
In the digital education landscape, cross-modal retrieval (CMR) from multimodal educational
slides represents a significant challenge, particularly because of the complex nature of …

Fine-grained Prototypical Voting with Heterogeneous Mixup for Semi-supervised 2D-3D Cross-modal Retrieval

F Zhang, XS Hua, C Chen… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
This paper studies the problem of semi-supervised 2D-3D retrieval which aims to align both
labeled and unlabeled 2D and 3D data into the same embedding space. The problem is …