Cross-modal retrieval: a systematic review of methods and future directions
With the exponential surge in diverse multimodal data, traditional unimodal retrieval
methods struggle to meet the needs of users seeking access to data across various …
MedShapeNet – a large-scale dataset of 3D medical shapes for computer vision
Objectives The shape is commonly used to describe the objects. State-of-the-art algorithms
in medical imaging are predominantly diverging from computer vision, where voxel grids …
Cross-modal contrastive learning for domain adaptation in 3d semantic segmentation
Abstract Domain adaptation for 3D point cloud has attracted a lot of interest since it can
avoid the time-consuming labeling process of 3D data to some extent. A recent work named …
Self-supervised auxiliary domain alignment for unsupervised 2D image-based 3D shape retrieval
AA Liu, C Zhang, W Li, X Gao, Z Sun… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Unsupervised 2D image-based 3D shape retrieval aims to match the similar 3D unlabeled
shapes when given a 2D labeled sample. Although a lot of methods have made a certain …
Comprehensive relationship reasoning for composed query based image retrieval
Composed Query Based Image Retrieval (CQBIR) aims at searching images relevant to a
composed query, i.e., a reference image together with a modifier text. Compared with …
Looking 3D: Anomaly Detection with 2D-3D Alignment
Automatic anomaly detection based on visual cues holds practical significance in various
domains such as manufacturing and product quality assessment. This paper introduces a …
T2TD: Text-3D generation model based on prior knowledge guidance
In recent years, 3D models have been utilized in many applications, such as auto-drivers,
3D reconstruction, VR, and AR. However, the scarcity of 3D model data does not meet its …
Fine-grained Prototypical Voting with Heterogeneous Mixup for Semi-supervised 2D-3D Cross-modal Retrieval
This paper studies the problem of semi-supervised 2D-3D retrieval which aims to align both
labeled and unlabeled 2D and 3D data into the same embedding space. The problem is …
RoMo: Robust Unsupervised Multimodal Learning With Noisy Pseudo Labels
The rise of the metaverse and the increasing volume of heterogeneous 2D and 3D data
have created a growing demand for cross-modal retrieval, enabling users to query …
Unsupervised self-training correction learning for 2D image-based 3D model retrieval
Y Zhou, Y Liu, J Xiao, M Liu, X Li, AA Liu - Information Processing & …, 2023 - Elsevier
Existing 2D image-based 3D model retrieval (IBMR) methods usually use the pseudo labels
as semantic guidance to reduce the domain-wise and class-wise feature distribution …