A survey of multi-modal knowledge graphs: Technologies and trends
In recent years, Knowledge Graphs (KGs) have played a crucial role in the development of
advanced knowledge-intensive applications, such as recommender systems and semantic …
Learning with noisy correspondence for cross-modal matching
Cross-modal matching, which aims to establish the correspondence between two different
modalities, is fundamental to a variety of tasks such as cross-modal retrieval and vision-and …
Cross modal retrieval with querybank normalisation
Profiting from large-scale training datasets, advances in neural architecture design and
efficient inference, joint embeddings have become the dominant approach for tackling cross …
Visual pivoting for (unsupervised) entity alignment
This work studies the use of visual semantic representations to align entities in
heterogeneous knowledge graphs (KGs). Images are natural components of many existing …
CODER: Coupled diversity-sensitive momentum contrastive learning for image-text retrieval
Image-Text Retrieval (ITR) is challenging in bridging visual and lingual modalities.
Contrastive learning has been adopted by most prior arts. Except for limited amount of …
DFMKE: A dual fusion multi-modal knowledge graph embedding framework for entity alignment
Entity alignment is critical for multiple knowledge graphs (KGs) integration. Although
researchers have made significant efforts to explore the relational embeddings between …
SGAligner: 3D scene alignment with scene graphs
Building 3D scene graphs has recently emerged as a topic in scene representation for
several embodied AI applications to represent the world in a structured and rich manner …
Adaptive offline quintuplet loss for image-text matching
Existing image-text matching approaches typically leverage triplet loss with online hard
negatives to train the model. For each image or text anchor in a training mini-batch, the …
GSSF: Generalized structural sparse function for deep cross-modal metric learning
Cross-modal metric learning is a prominent research topic that bridges the semantic
heterogeneity between vision and language. Existing methods frequently utilize simple …
Balance act: Mitigating hubness in cross-modal retrieval with query and gallery banks
In this work, we present a post-processing solution to address the hubness problem in cross-
modal retrieval, a phenomenon where a small number of gallery data points are frequently …