Google Scholar

A review of generalized zero-shot learning methods

F Pourpanah, M Abdar, Y Luo, X Zhou… - IEEE transactions on …, 2022 - ieeexplore.ieee.org

Generalized zero-shot learning (GZSL) aims to train a model for classifying data samples
under the condition that some output classes are unknown during supervised learning. To …

Cited by 447 - All 11 versions

[HTML] sciencedirect.com

[HTML] Scene graph generation: A comprehensive survey

H Li, G Zhu, L Zhang, Y Jiang, Y Dang, H Hou, P Shen… - Neurocomputing, 2024 - Elsevier

Deep learning techniques have led to remarkable breakthroughs in the field of object
detection and have spawned a lot of scene-understanding tasks in recent years. Scene …

Cited by 114 - All 8 versions

[PDF] arxiv.org

Panoptic scene graph generation

J Yang, YZ Ang, Z Guo, K Zhou, W Zhang… - European Conference on …, 2022 - Springer

Existing research addresses scene graph generation (SGG), a critical technology for scene
understanding in images, from a detection perspective, i.e., objects are detected using …

Cited by 123 - All 5 versions

[PDF] thecvf.com

Teaching structured vision & language concepts to vision & language models

S Doveh, A Arbelle, S Harary… - Proceedings of the …, 2023 - openaccess.thecvf.com

Vision and Language (VL) models have demonstrated remarkable zero-shot performance in
a variety of tasks. However, some aspects of complex language understanding still remain a …

Cited by 80 - All 10 versions

[PDF] thecvf.com

CLIP-Event: Connecting text and images with event structures

M Li, R Xu, S Wang, L Zhou, X Lin… - Proceedings of the …, 2022 - openaccess.thecvf.com

Vision-language (V+L) pretraining models have achieved great success in
supporting multimedia applications by understanding the alignments between images and …

Cited by 140 - All 9 versions

[PDF] thecvf.com

H2O: Two hands manipulating objects for first person interaction recognition

T Kwon, B Tekin, J Stühmer, F Bogo… - Proceedings of the …, 2021 - openaccess.thecvf.com

We present a comprehensive framework for egocentric interaction recognition using
markerless 3D annotations of two hands manipulating objects. To this end, we propose a …

Cited by 176 - All 8 versions

[PDF] thecvf.com

Compositional feature augmentation for unbiased scene graph generation

L Li, G Chen, J Xiao, Y Yang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Scene Graph Generation (SGG) aims to detect all the visual relation triplets <sub,
pred, obj> in a given image. With the emergence of various advanced techniques for better …

Cited by 37 - All 8 versions

[PDF] arxiv.org

DRG: Dual relation graph for human-object interaction detection

C Gao, J Xu, Y Zou, JB Huang - … Conference, Glasgow, UK, August 23–28 …, 2020 - Springer

We tackle the challenging problem of human-object interaction (HOI) detection. Existing
methods either recognize the interaction of each human-object pair in isolation or perform …

Cited by 253 - All 5 versions

[PDF] neurips.cc

Dense and aligned captions (DAC) promote compositional reasoning in VL models

S Doveh, A Arbelle, S Harary… - Advances in …, 2023 - proceedings.neurips.cc

Vision and Language (VL) models offer an effective method for aligning representation
spaces of images and text allowing for numerous applications such as cross-modal retrieval …

Cited by 43 - All 10 versions

[PDF] thecvf.com

Composing text and image for image retrieval - an empirical odyssey

N Vo, L Jiang, C Sun, K Murphy, LJ Li… - Proceedings of the …, 2019 - openaccess.thecvf.com

In this paper, we study the task of image retrieval, where the input query is specified in the
form of an image plus some text that describes desired modifications to the input image. For …

Cited by 402 - All 8 versions

Compositional learning for human object interaction

A review of generalized zero-shot learning methods‏
