- Academic Search

Benchmarks for automated commonsense reasoning: A survey

E Davis - ACM Computing Surveys, 2023 - dl.acm.org

More than one hundred benchmarks have been developed to test the commonsense
knowledge and commonsense reasoning abilities of artificial intelligence (AI) systems …

Cited by 64 · All 4 versions

[PDF] sciencedirect.com

Multimodal research in vision and language: A review of current and emerging trends

S Uppal, S Bhagat, D Hazarika, N Majumder, S Poria… - Information …, 2022 - Elsevier

Deep Learning and its applications have cascaded impactful research and development
with a diverse range of modalities present in the real-world data. More recently, this has …

Cited by 107 · All 5 versions

[PDF] thecvf.com

Query and attention augmentation for knowledge-based explainable reasoning

Y Zhang, M Jiang, Q Zhao - … of the IEEE/CVF Conference on …, 2022 - openaccess.thecvf.com

Explainable visual question answering (VQA) models have been developed with neural
modules and query-based knowledge incorporation to answer knowledge-requiring …

Cited by 20 · All 7 versions

[PDF] aaai.org

Dynamic key-value memory enhanced multi-step graph reasoning for knowledge-based visual question answering

M Li, MF Moens - Proceedings of the AAAI Conference on Artificial …, 2022 - ojs.aaai.org

Knowledge-based visual question answering (VQA) is a vision-language task that
requires an agent to correctly answer image-related questions using knowledge that is not …

Cited by 23 · All 8 versions

[PDF] springer.com

A survey on knowledge-enhanced multimodal learning

M Lymperaiou, G Stamou - Artificial Intelligence Review, 2024 - Springer

Multimodal learning has been a field of increasing interest, aiming to combine various
modalities in a single joint representation. Especially in the area of visiolinguistic (VL) …

Cited by 13 · All 6 versions

Knowledge is power: Open-world knowledge representation learning for knowledge-based visual reasoning

W Zheng, L Yan, FY Wang - Artificial Intelligence, 2024 - Elsevier

Knowledge-based visual reasoning requires the ability to associate outside
knowledge that is not present in a given image for cross-modal visual understanding. Two …

Cited by 3 · All 3 versions

[PDF] arxiv.org

A survey on interpretable cross-modal reasoning

D Xue, S Qian, Z Zhou, C Xu - arXiv preprint arXiv:2309.01955, 2023 - arxiv.org

In recent years, cross-modal reasoning (CMR), the process of understanding and reasoning
across different modalities, has emerged as a pivotal area with applications spanning from …

Cited by 5 · All 2 versions

Interpretable visual reasoning: A survey

F He, Y Wang, X Miao, X Sun - Image and Vision Computing, 2021 - Elsevier

Visual reasoning refers to the process of solving questions about visual information. At
present, most visual reasoning models are mainly based on deep learning and end-to-end …

Cited by 16 · All 2 versions

[PDF] thecvf.com

VQA with no questions-answers training

BZ Vatashsky, S Ullman - … of the IEEE/CVF Conference on …, 2020 - openaccess.thecvf.com

Methods for teaching machines to answer visual questions have made significant progress
in recent years, but current methods still lack important human capabilities, including …

Cited by 15 · All 9 versions

[PDF] aclanthology.org

Towards One-to-Many Visual Question Answering

H Ji, Q Si, Z Lin, Y Cao, W Wang - Findings of the Association for …, 2024 - aclanthology.org

Most existing Visual Question Answering (VQA) systems are constrained to support
domain-specific questions, i.e., to train different models separately for different VQA tasks …

Explainable high-order visual question reasoning: A new benchmark and knowledge-routed network

Benchmarks for automated commonsense reasoning: A survey
