- Academic Search

Fine-grained image-text matching by cross-modal hard aligning network

Z Pan, F Wu, B Zhang - … of the IEEE/CVF conference on …, 2023 - openaccess.thecvf.com

Current state-of-the-art image-text matching methods implicitly align the visual-semantic
fragments, like regions in images and words in sentences, and adopt cross-attention …

Cited by: 53

Learning semantic relationship among instances for image-text matching

Z Fu, Z Mao, Y Song, Y Zhang - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Image-text matching, a bridge connecting image and language, is an important task, which
generally learns a holistic cross-modal embedding to achieve a high-quality semantic …

Cited by: 49

Cross-modal active complementary learning with self-refining correspondence

Y Qin, Y Sun, D Peng, JT Zhou… - Advances in Neural …, 2023 - proceedings.neurips.cc

Recently, image-text matching has attracted more and more attention from academia and
industry, which is fundamental to understanding the latent correspondence across visual …

Cited by: 22

Cross-modal semantic enhanced interaction for image-sentence retrieval

X Ge, F Chen, S Xu, F Tao… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Image-sentence retrieval has attracted extensive research attention in multimedia and
computer vision due to its promising application. The key issue lies in jointly learning the …

Cited by: 38

Interacting-enhancing feature transformer for cross-modal remote-sensing image and text retrieval

X Tang, Y Wang, J Ma, X Zhang, F Liu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Cross-modal remote-sensing image–text retrieval (CMRSITR) is a challenging topic in the
remote-sensing (RS) community. It has gained growing attention because it can be flexibly …

Cited by: 35

Quaternion relation embedding for scene graph generation

Z Wang, X Xu, G Wang, Y Yang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

As an important visual understanding task, scene graph generation has been drawing
widespread attention and could boost a broad range of downstream vision applications …

Cited by: 22

Esa: External space attention aggregation for image-text retrieval

H Zhu, C Zhang, Y Wei, S Huang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Due to the large gap between vision and language modalities, effective and efficient
image-text retrieval is still an unsolved problem. Recent progress devotes to unilaterally pursuing …

Cited by: 24

Transferring image-clip to video-text retrieval via temporal relations

H Fang, P **, and …

Cited by: 31

Quaternion representation learning for cross-modal matching

Z Wang, X Xu, J Wei, N Xie, J Shao, Y Yang - Knowledge-Based Systems, 2023 - Elsevier

The main challenge of cross-modal matching is to construct a shared subspace reflecting
semantic closeness. Asymmetric relevance, especially the one-to-many matching case …

Cited by: 11

Image-text embedding learning via visual and textual semantic reasoning

Fine-grained image-text matching by cross-modal hard aligning network
