- Academic Search

X Wang, G Chen, G Qian, P Gao, XY Wei… - Machine Intelligence …, 2023 - Springer

With the urgent demand for generalized deep models, many pre-trained big models are
proposed, such as bidirectional encoder representations (BERT), vision transformer (ViT) …

Cited by 196 · 8 versions


[PDF] arxiv.org

Deep learning for person re-identification: A survey and outlook

M Ye, J Shen, G Lin, T …

… cameras. With the advancement of deep neural networks and increasing …

Cited by 2056 · 12 versions


[PDF] thecvf.com

Cross-modal implicit relation reasoning and aligning for text-to-image person retrieval

D Jiang, M Ye - Proceedings of the IEEE/CVF Conference …, 2023 - openaccess.thecvf.com

Text-to-image person retrieval aims to identify the target person based on a given textual
description query. The primary challenge is to learn the mapping of visual and textual …

Cited by 175 · 5 versions · HTML version


[PDF] arxiv.org

Open-vocabulary DETR with conditional matching

Y Zang, W Li, K Zhou, C Huang, CC Loy - European Conference on …, 2022 - Springer

Open-vocabulary object detection, which is concerned with the problem of detecting novel
objects guided by natural language, has gained increasing attention from the community …

Cited by 223 · 6 versions


[PDF] aclanthology.org

What Matters in Training a GPT4-Style Language Model with Multimodal Inputs?

Y Zeng, H Zhang, J Zheng, J …

… between visual-textual modalities. To …

Cited by 116 · 6 versions


Person search with natural language description

Large-scale multi-modal pre-trained models: A comprehensive survey
