Composing object relations and attributes for image-text matching

K Pham, C Huynh, SN Lim… - Proceedings of the …, 2024 - openaccess.thecvf.com
We study the visual semantic embedding problem for image-text matching. Most existing
work utilizes a tailored cross-attention mechanism to perform local alignment across the two …

Multilateral semantic relations modeling for image text retrieval

Z Wang, Z Gao, K Guo, Y Yang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Image-text retrieval is a fundamental task to bridge vision and language by exploiting
various strategies to fine-grained alignment between regions and words. This is still tough …

3SHNet: Boosting image–sentence retrieval via visual semantic–spatial self-highlighting

X Ge, S Xu, F Chen, J Wang, G Wang, S An… - Information Processing & …, 2024 - Elsevier
In this paper, we propose a novel visual Semantic-Spatial Self-Highlighting Network (termed
3SHNet) for high-precision, high-efficiency and high-generalization image–sentence …

Cross-modal semantic enhanced interaction for image-sentence retrieval

X Ge, F Chen, S Xu, F Tao… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Image-sentence retrieval has attracted extensive research attention in multimedia and
computer vision due to its promising application. The key issue lies in jointly learning the …

MKVSE: Multimodal knowledge enhanced visual-semantic embedding for image-text retrieval

D Feng, X He, Y Peng - ACM Transactions on Multimedia Computing …, 2023 - dl.acm.org
Image-text retrieval aims to take the text (image) query to retrieve the semantically relevant
images (texts), which is fundamental and critical in the search system, online shopping, and …

Geometric matching for cross-modal retrieval

Z Wang, Z Gao, Y Yang, G Wang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Despite its significant progress, cross-modal retrieval still suffers from one-to-many matching
cases, where the multiplicity of semantic instances in another modality could be acquired by …

ESA: External space attention aggregation for image-text retrieval

H Zhu, C Zhang, Y Wei, S Huang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Due to the large gap between vision and language modalities, effective and efficient image-
text retrieval is still an unsolved problem. Recent progress devotes to unilaterally pursuing …

Point to rectangle matching for image text retrieval

Z Wang, Z Gao, X Xu, Y Luo, Y Yang… - Proceedings of the 30th …, 2022 - dl.acm.org
The difficulty of image-text retrieval is further exacerbated by the phenomenon of one-to-
many correspondence, where multiple semantic manifestations of the other modality could …

Reservoir computing transformer for image-text retrieval

W Li, Z Ma, LJ Deng, P Wang, J Shi, X Fan - Proceedings of the 31st …, 2023 - dl.acm.org
Although the attention mechanism in transformers has proven successful in image-text
retrieval tasks, most transformer models suffer from a large number of parameters. Inspired …

CFIR: Fast and Effective Long-Text To Image Retrieval for Large Corpora

Z Long, X Ge, R McCreadie, JM Jose - Proceedings of the 47th …, 2024 - dl.acm.org
Text-to-image retrieval aims to find the relevant images based on a text query, which is
important in various use-cases, such as digital libraries, e-commerce, and multimedia …