Cross-modal retrieval: a systematic review of methods and future directions

T Wang, F Li, L Zhu, J Li, Z Zhang… - Proceedings of the …, 2025 - ieeexplore.ieee.org
With the exponential surge in diverse multimodal data, traditional unimodal retrieval
methods struggle to meet the needs of users seeking access to data across various …

MedShapeNet – a large-scale dataset of 3D medical shapes for computer vision

J Li, Z Zhou, J Yang, A Pepe, C Gsaxner… - Biomedical …, 2024 - degruyter.com
Objectives: The shape is commonly used to describe the objects. State-of-the-art algorithms
in medical imaging are predominantly diverging from computer vision, where voxel grids …

Cross-modal contrastive learning for domain adaptation in 3d semantic segmentation

B Xing, X Ying, R Wang, J Yang, T Chen - Proceedings of the AAAI …, 2023 - ojs.aaai.org
Domain adaptation for 3D point cloud has attracted a lot of interest since it can
avoid the time-consuming labeling process of 3D data to some extent. A recent work named …

Self-supervised auxiliary domain alignment for unsupervised 2D image-based 3D shape retrieval

AA Liu, C Zhang, W Li, X Gao, Z Sun… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Unsupervised 2D image-based 3D shape retrieval aims to match the similar 3D unlabeled
shapes when given a 2D labeled sample. Although a lot of methods have made a certain …

Comprehensive relationship reasoning for composed query based image retrieval

F Zhang, M Yan, J Zhang, C Xu - Proceedings of the 30th ACM …, 2022 - dl.acm.org
Composed Query Based Image Retrieval (CQBIR) aims at searching images relevant to a
composed query, i.e., a reference image together with a modifier text. Compared with …

Looking 3D: Anomaly Detection with 2D-3D Alignment

A Bhunia, C Li, H Bilen - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
Automatic anomaly detection based on visual cues holds practical significance in various
domains such as manufacturing and product quality assessment. This paper introduces a …

T2TD: Text-3D generation model based on prior knowledge guidance

W Nie, R Chen, W Wang, B Lepri… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
In recent years, 3D models have been utilized in many applications, such as auto-drivers,
3D reconstruction, VR, and AR. However, the scarcity of 3D model data does not meet its …

Fine-grained Prototypical Voting with Heterogeneous Mixup for Semi-supervised 2D-3D Cross-modal Retrieval

F Zhang, XS Hua, C Chen… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
This paper studies the problem of semi-supervised 2D-3D retrieval which aims to align both
labeled and unlabeled 2D and 3D data into the same embedding space. The problem is …

RoMo: Robust Unsupervised Multimodal Learning With Noisy Pseudo Labels

Y Li, Y Qin, Y Sun, D Peng, X Peng… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
The rise of the metaverse and the increasing volume of heterogeneous 2D and 3D data
have created a growing demand for cross-modal retrieval, enabling users to query …

Unsupervised self-training correction learning for 2D image-based 3D model retrieval

Y Zhou, Y Liu, J Xiao, M Liu, X Li, AA Liu - Information Processing & …, 2023 - Elsevier
Existing 2D image-based 3D model retrieval (IBMR) methods usually use the pseudo labels
as semantic guidance to reduce the domain-wise and class-wise feature distribution …