Pic2Word: Mapping pictures to words for zero-shot composed image retrieval

K Saito, K Sohn, X Zhang, CL Li… - Proceedings of the …, 2023 - openaccess.thecvf.com

Abstract In Composed Image Retrieval (CIR), a user combines a query image with text to
describe their intended target. Existing methods rely on supervised learning of CIR models …

Cited by 109 · All 11 versions

[PDF] thecvf.com

GeneCIS: A benchmark for general conditional image similarity

S Vaze, N Carion, I Misra - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com

We argue that there are many notions of 'similarity' and that models, like humans, should be
able to adapt to these dynamically. This contrasts with most representation learning …

Cited by 26 · All 9 versions

[PDF] thecvf.com

ProbVLM: Probabilistic adapter for frozen vison-language models

U Upadhyay, S Karthik, M Mancini… - Proceedings of the …, 2023 - openaccess.thecvf.com

Large-scale vision-language models (VLMs) like CLIP successfully find correspondences
between images and text. Through the standard deterministic mapping process, an image or …

Cited by 22 · All 10 versions

[PDF] arxiv.org

Improved probabilistic image-text representations

S Chun - arXiv preprint arXiv:2305.18171, 2023 - arxiv.org

Image-Text Matching (ITM) task, a fundamental vision-language (VL) task, suffers from the
inherent ambiguity arising from multiplicity and imperfect annotations. Deterministic …

Cited by 28 · All 5 versions

[PDF] mlr.press

Probabilistic contrastive learning recovers the correct aleatoric uncertainty of ambiguous inputs

M Kirchhof, E Kasneci, SJ Oh - International Conference on …, 2023 - proceedings.mlr.press

Contrastively trained encoders have recently been proven to invert the data-generating
process: they encode each input, e.g., an image, into the true latent vector that generated the …

Cited by 20 · All 8 versions

[PDF] thecvf.com

Attribute-guided pedestrian retrieval: Bridging person re-id with internal attribute variability

Y Huang, Z Zhang, Q Wu, Y Zhong… - Proceedings of the …, 2024 - openaccess.thecvf.com

In various domains such as surveillance and smart retail, pedestrian retrieval, centering on
person re-identification (Re-ID), plays a pivotal role. Existing Re-ID methodologies often …

Cited by 4 · All 4 versions

[PDF] arxiv.org

Hierarchical matching and reasoning for multi-query image retrieval

Z Ji, Z Li, Y Zhang, H Wang, Y Pang, X Li - Neural Networks, 2024 - Elsevier

As a promising field, Multi-Query Image Retrieval (MQIR) aims at searching for the
semantically relevant image given multiple region-specific text queries. Existing works …

Cited by 9 · All 9 versions

[PDF] acm.org

DiffCAD: Weakly-supervised probabilistic CAD model retrieval and alignment from an RGB image

D Gao, D Rozenberszki, S Leutenegger… - ACM Transactions on …, 2024 - dl.acm.org

Perceiving 3D structures from RGB images based on CAD model primitives can enable an
effective, efficient 3D object-based representation of scenes. However, current approaches …

Cited by 4 · All 3 versions

[PDF] arxiv.org

Robust multimodal learning via representation decoupling

S Wei, Y Luo, Y Wang, C Luo - European Conference on Computer Vision, 2024 - Springer

Multimodal learning robust to missing modality has attracted increasing attention due to its
practicality. Existing methods tend to address it by learning a common subspace …

Cited by 2 · All 8 versions

[PDF] github.io

Real20M: A large-scale e-commerce dataset for cross-domain retrieval

Y Chen, H Zhong, X He, Y Peng, L Cheng - Proceedings of the 31st ACM …, 2023 - dl.acm.org

In e-commerce, products and micro-videos serve as two primary carriers. Introducing cross-
domain retrieval between these carriers can establish associations, thereby leading to the …

Cited by 8 · All 2 versions

Probabilistic compositional embeddings for multimodal image retrieval

Pic2word: Mapping pictures to words for zero-shot composed image retrieval
