
Y Ding, S Luo, H Chung, SC Han - Joint European Conference on …, 2023 - Springer

Abstract: Document-based Visual Question Answering examines the document
understanding of document images in conditions of natural language questions. We …

Cited by: 21. All versions: 6.


A survey of recent approaches to form understanding in scanned documents

A Abdallah, D Eberharter, Z Pfister, A Jatowt - Artificial Intelligence Review, 2024 - Springer

This paper presents a comprehensive survey of over 100 research works on the topic of form
understanding in the context of scanned documents. We delve into recent advancements …

Cited by: 1. All versions: 2.

Towards Multi-modal Interpretation and Explanation

S Luo - 2023 - ses.library.usyd.edu.au

Multimodal task processes on different modalities simultaneously. Visual Question
Answering, as a type of multimodal task, aims to answer the natural question answering …

