VisQA: X-raying vision and language reasoning in transformers

T Jaunet, C Kervadec, R Vuillemot… - … on Visualization and …, 2021 - ieeexplore.ieee.org
Visual Question Answering systems target answering open-ended textual questions given
input images. They are a testbed for learning high-level reasoning with a primary use in HCI …

A critical analysis of benchmarks, techniques, and models in medical visual question answering

S Al-Hadhrami, MEB Menai, S Al-Ahmadi… - IEEE …, 2023 - ieeexplore.ieee.org
This paper comprehensively reviews medical VQA models, structures, and datasets,
focusing on combining vision and language. Over 75 models and their statistical and SWOT …

Unsupervised and pseudo-supervised vision-language alignment in visual dialog

F Chen, D Zhang, X Chen, J Shi, S Xu… - Proceedings of the 30th …, 2022 - dl.acm.org
Visual dialog requires models to give reasonable answers according to a series of coherent
questions and related visual concepts in images. However, most current work either focuses …

Weakly supervised relative spatial reasoning for visual question answering

P Banerjee, T Gokhale, Y Yang… - Proceedings of the …, 2021 - openaccess.thecvf.com
Vision-and-language (V&L) reasoning necessitates perception of visual concepts
such as objects and actions, understanding semantics and language grounding, and …

How transferable are reasoning patterns in VQA?

C Kervadec, T Jaunet, G Antipov… - Proceedings of the …, 2021 - openaccess.thecvf.com
Since its inception, Visual Question Answering (VQA) has been notorious as a
task where models are prone to exploiting biases in datasets to find shortcuts instead of …

Knowledge-embedded mutual guidance for visual reasoning

W Zheng, L Yan, L Chen, Q Li… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Visual reasoning between visual images and natural language is a long-standing challenge
in computer vision. Most of the methods aim to look for answers to questions only on the …

Self-attention guided representation learning for image-text matching

X Qi, Y Zhang, J Qi, H Lu - Neurocomputing, 2021 - Elsevier
Image-text matching plays an important role in bridging vision and language. Most existing
research works embed both images and sentences into a joint latent space to measure their …

RG-SAN: Rule-Guided Spatial Awareness Network for End-to-End 3D Referring Expression Segmentation

C Wu, Q Chen, J Ji, H Wang, Y Ma, Y Huang… - arXiv preprint arXiv …, 2024 - arxiv.org
3D Referring Expression Segmentation (3D-RES) aims to segment 3D objects by correlating
referring expressions with point clouds. However, traditional approaches frequently …

Webly supervised knowledge-embedded model for visual reasoning

W Zheng, L Yan, W Zhang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Visual reasoning between visual images and natural language remains a long-standing
challenge in computer vision. Conventional deep supervision methods aim at finding …

Supervising the transfer of reasoning patterns in VQA

C Kervadec, C Wolf, G Antipov… - Advances in Neural …, 2021 - proceedings.neurips.cc
Abstract Methods for Visual Question Anwering (VQA) are notorious for leveraging dataset
biases rather than performing reasoning, hindering generalization. It has been recently …