A survey of multi-modal knowledge graphs: Technologies and trends

W Liang, PD Meo, Y Tang, J Zhu - ACM Computing Surveys, 2024 - dl.acm.org
In recent years, Knowledge Graphs (KGs) have played a crucial role in the development of
advanced knowledge-intensive applications, such as recommender systems and semantic …

Learning with noisy correspondence for cross-modal matching

Z Huang, G Niu, X Liu, W Ding… - Advances in Neural …, 2021 - proceedings.neurips.cc
Cross-modal matching, which aims to establish the correspondence between two different
modalities, is fundamental to a variety of tasks such as cross-modal retrieval and vision-and …

Cross modal retrieval with querybank normalisation

SV Bogolin, I Croitoru, H Jin, Y Liu… - Proceedings of the …, 2022 - openaccess.thecvf.com
Profiting from large-scale training datasets, advances in neural architecture design and
efficient inference, joint embeddings have become the dominant approach for tackling cross …

Visual pivoting for (unsupervised) entity alignment

F Liu, M Chen, D Roth, N Collier - … of the AAAI conference on artificial …, 2021 - ojs.aaai.org
This work studies the use of visual semantic representations to align entities in
heterogeneous knowledge graphs (KGs). Images are natural components of many existing …

CODER: Coupled diversity-sensitive momentum contrastive learning for image-text retrieval

H Wang, D He, W Wu, B Xia, M Yang, F Li, Y Yu… - … on Computer Vision, 2022 - Springer
Abstract Image-Text Retrieval (ITR) is challenging in bridging visual and lingual modalities.
Contrastive learning has been adopted by most prior arts. Except for limited amount of …

DFMKE: A dual fusion multi-modal knowledge graph embedding framework for entity alignment

J Zhu, C Huang, P De Meo - Information Fusion, 2023 - Elsevier
Entity alignment is critical for multiple knowledge graphs (KGs) integration. Although
researchers have made significant efforts to explore the relational embeddings between …

SGAligner: 3D scene alignment with scene graphs

SD Sarkar, O Miksik, M Pollefeys… - Proceedings of the …, 2023 - openaccess.thecvf.com
Building 3D scene graphs has recently emerged as a topic in scene representation for
several embodied AI applications to represent the world in a structured and rich manner …

Adaptive offline quintuplet loss for image-text matching

T Chen, J Deng, J Luo - Computer Vision–ECCV 2020: 16th European …, 2020 - Springer
Existing image-text matching approaches typically leverage triplet loss with online hard
negatives to train the model. For each image or text anchor in a training mini-batch, the …

GSSF: Generalized structural sparse function for deep cross-modal metric learning

H Diao, Y Zhang, S Gao, J Zhu… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Cross-modal metric learning is a prominent research topic that bridges the semantic
heterogeneity between vision and language. Existing methods frequently utilize simple …

Balance act: Mitigating hubness in cross-modal retrieval with query and gallery banks

Y Wang, X Jian, B Xue - arXiv preprint arXiv:2310.11612, 2023 - arxiv.org
In this work, we present a post-processing solution to address the hubness problem in cross-
modal retrieval, a phenomenon where a small number of gallery data points are frequently …