Cross-modal retrieval: a systematic review of methods and future directions

T Wang, F Li, L Zhu, J Li, Z Zhang… - Proceedings of the …, 2025 - ieeexplore.ieee.org
With the exponential surge in diverse multimodal data, traditional unimodal retrieval
methods struggle to meet the needs of users seeking access to data across various …

A survey of efficient fine-tuning methods for vision-language models—prompt and adapter

Contrastive grouping with transformer for referring image segmentation

J Tang, G Zheng, C Shi, S Yang - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Referring image segmentation aims to segment the target referent in an image conditioning
on a natural language expression. Existing one-stage methods employ per-pixel …

Cross-modal self-attention network for referring image segmentation

L Ye, M Rochan, Z Liu, Y Wang - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com
We consider the problem of referring image segmentation. Given an input image and a
natural language expression, the goal is to segment the object referred by the language …