Cross-modal retrieval: a systematic review of methods and future directions
With the exponential surge in diverse multimodal data, traditional unimodal retrieval
methods struggle to meet the needs of users seeking access to data across various …
Cross-modal active complementary learning with self-refining correspondence
Recently, image-text matching has attracted more and more attention from academia and
industry, which is fundamental to understanding the latent correspondence across visual …
Robust multi-view clustering with noisy correspondence
Deep multi-view clustering leverages deep neural networks to achieve promising
performance, but almost all existing methods implicitly assume that all views are aligned …
Noisy-correspondence learning for text-to-image person re-identification
Text-to-image person re-identification (TIReID) is a compelling topic in the cross-modal
community which aims to retrieve the target person based on a textual query. Although …
Robust object re-identification with coupled noisy labels
In this paper, we reveal and study a new challenging problem faced by object Re-
IDentification (ReID), i.e., Coupled Noisy Labels (CNL), which refers to the Noisy Annotation …
Breaking through the noisy correspondence: A robust model for image-text matching
Unleashing the power of image-text matching in real-world applications is hampered by
noisy correspondence. Manually curating high-quality datasets is expensive and time …
Senet: spatial information enhancement for semantic segmentation neural networks
Y Huang, P Shi, H He, H He, B Zhao - The Visual Computer, 2024 - Springer
Image semantic segmentation is a basic task of computer vision, and plays an important role
in automatic driving, robot navigation and many other fields. However, the expensive …
Noise-robust Vision-language Pre-training with Positive-negative Learning
Vision-Language Pre-training (VLP) has shown promising performance in various tasks by
learning a generic image-text representation space. However, most existing VLP methods …
Semantic-aware Contrastive Learning with Proposal Suppression for Video Semantic Role Grounding
Video semantic role grounding has gained substantial interest from both the academic and
industrial communities. While existing methods have demonstrated considerable …
Semi-supervised semi-paired cross-modal hashing
Large-scale cross-modal hashing has drawn extensive attention due to its attractive
efficiency in both storage and retrieval. Existing methods exhibit poor performance when …