Visual genome: Connecting language and vision using crowdsourced dense image annotations

R Krishna, Y Zhu, O Groth, J Johnson, K Hata… - International journal of …, 2017 - Springer
Despite progress in perceptual tasks such as image classification, computers still perform
poorly on cognitive tasks such as image description and question answering. Cognition is …

VQA: Visual question answering

S Antol, A Agrawal, J Lu, M Mitchell… - Proceedings of the …, 2015 - openaccess.thecvf.com
We propose the task of free-form and open-ended Visual Question Answering (VQA). Given
an image and a natural language question about the image, the task is to provide an …

Knowledge graphs meet multi-modal learning: A comprehensive survey

Z Chen, Y Zhang, Y Fang, Y Geng, L Guo… - arXiv preprint arXiv …, 2024 - arxiv.org
Knowledge Graphs (KGs) play a pivotal role in advancing various AI applications, with the
semantic web community's exploration into multi-modal dimensions unlocking new avenues …

OK-VQA: A visual question answering benchmark requiring external knowledge

K Marino, M Rastegari, A Farhadi… - Proceedings of the …, 2019 - openaccess.thecvf.com
Visual Question Answering (VQA) in its ideal form lets us study reasoning in the
joint space of vision and language and serves as a proxy for the AI task of scene …

Neural motifs: Scene graph parsing with global context

R Zellers, M Yatskar, S Thomson… - Proceedings of the …, 2018 - openaccess.thecvf.com
We investigate the problem of producing structured graph representations of visual scenes.
Our work analyzes the role of motifs: regularly appearing substructures in scene graphs. We …

KRISP: Integrating implicit and symbolic knowledge for open-domain knowledge-based VQA

K Marino, X Chen, D Parikh, A Gupta… - Proceedings of the …, 2021 - openaccess.thecvf.com
One of the most challenging question types in VQA is when answering the question requires
outside knowledge not present in the image. In this work we study open-domain knowledge …

Visual translation embedding network for visual relation detection

H Zhang, Z Kyaw, SF Chang… - Proceedings of the IEEE …, 2017 - openaccess.thecvf.com
Visual relations, such as" person ride bike" and" bike next to car", offer a comprehensive
scene understanding of an image, and have already shown their great utility in connecting …

Human‐centered artificial intelligence and machine learning

MO Riedl - Human behavior and emerging technologies, 2019 - Wiley Online Library
Humans are increasingly coming into contact with artificial intelligence (AI) and machine
learning (ML) systems. Human‐centered AI is a perspective on AI and ML that algorithms …

Visual commonsense R-CNN

T Wang, J Huang, H Zhang… - Proceedings of the IEEE …, 2020 - openaccess.thecvf.com
We present a novel unsupervised feature representation learning method, Visual
Commonsense Region-based Convolutional Neural Network (VC R-CNN), to serve as an …

Cross-media analysis and reasoning: advances and directions

Y Peng, W Zhu, Y Zhao, C Xu, Q Huang, H Lu… - Frontiers of Information …, 2017 - Springer
Cross-media analysis and reasoning is an active research area in computer science, and a
promising direction for artificial intelligence. However, to the best of our knowledge, no …