Multimodal fusion on low-quality data: A comprehensive survey

Q Zhang, Y Wei, Z Han, H Fu, X Peng, C Deng… - arXiv preprint arXiv …, 2024 - arxiv.org
Multimodal fusion focuses on integrating information from multiple modalities with the goal of
more accurate prediction, which has achieved remarkable progress in a wide range of …

MADTP: Multimodal alignment-guided dynamic token pruning for accelerating vision-language transformer

J Cao, P Ye, S Li, C Yu, Y Tang… - Proceedings of the …, 2024 - openaccess.thecvf.com
Vision-Language Transformers (VLTs) have shown great success recently but are
meanwhile accompanied by heavy computation costs where a major reason can be …

C2KD: Bridging the modality gap for cross-modal knowledge distillation

F Huo, W Xu, J Guo, H Wang… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Existing Knowledge Distillation (KD) methods typically focus on transferring
knowledge from a large-capacity teacher to a low-capacity student model achieving …

Suppress and rebalance: Towards generalized multi-modal face anti-spoofing

X Lin, S Wang, R Cai, Y Liu, Y Fu… - Proceedings of the …, 2024 - openaccess.thecvf.com
Face Anti-Spoofing (FAS) is crucial for securing face recognition systems against
presentation attacks. With advancements in sensor manufacture and multi-modal learning …

A variational expectation-maximization framework for balanced multi-scale learning of protein and drug interactions

J Rao, J Xie, Q Yuan, D Liu, Z Wang, Y Lu… - Nature …, 2024 - nature.com
Protein functions are characterized by interactions with proteins, drugs, and other
biomolecules. Understanding these interactions is essential for deciphering the molecular …

Multimodal representation learning by alternating unimodal adaptation

X Zhang, J Yoon, M Bansal… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Multimodal learning, which integrates data from diverse sensory modes, plays a pivotal role
in artificial intelligence. However, existing multimodal learning methods often struggle with …

Facilitating multimodal classification via dynamically learning modality gap

Y Yang, F Wan, QY Jiang, Y Xu - Advances in Neural …, 2025 - proceedings.neurips.cc
Multimodal learning falls into the trap of the optimization dilemma due to the modality
imbalance phenomenon, leading to unsatisfactory performance in real applications. A core …

Embracing unimodal aleatoric uncertainty for robust multimodal fusion

Z Gao, X Jiang, X Xu, F Shen, Y Li… - Proceedings of the …, 2024 - openaccess.thecvf.com
As a fundamental problem in multimodal learning, multimodal fusion aims to compensate for
the inherent limitations of a single modality. One challenge of multimodal fusion is that the …

Enhancing multimodal cooperation via sample-level modality valuation

Y Wei, R Feng, Z Wang, D Hu - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
One primary topic of multimodal learning is to jointly incorporate heterogeneous information
from different modalities. However, most models often suffer from unsatisfactory multimodal …

Test-time adaptation against multi-modal reliability bias

M Yang, Y Li, C Zhang, P Hu, X Peng - The Twelfth International …, 2024 - openreview.net
Test-time adaptation (TTA) has emerged as a new paradigm for reconciling distribution shifts
across domains without accessing source data. However, existing TTA methods mainly …