DocILE benchmark for document information localization and extraction

Š Šimsa, M Šulc, M Uřičář, Y Patel, A Hamdi… - … on Document Analysis …, 2023 - Springer
This paper introduces the DocILE benchmark with the largest dataset of business
documents for the tasks of Key Information Localization and Extraction and Line Item …

Overview of DocILE 2023: Document Information Localization and Extraction

Š Šimsa, M Uřičář, M Šulc, Y Patel, A Hamdi… - … Conference of the Cross …, 2023 - Springer
This paper provides an overview of the DocILE 2023 Competition, its tasks, participant
submissions, the competition results and possible future research directions. This first …

Deep Learning based Key Information Extraction from Business Documents: Systematic Literature Review

A Rombach, P Fettke - arXiv preprint arXiv:2408.06345, 2024 - arxiv.org
Extracting key information from documents represents a large portion of business workloads
and therefore offers a high potential for efficiency improvements and process automation …

Deep learning approaches for information extraction from visually rich documents: datasets, challenges and methods

H Gbada, K Kalti, MA Mahjoub - International Journal on Document …, 2024 - Springer
This paper focuses on Information Extraction from Visually Rich Documents, exploring how
deep learning methods are applied in this field. For the purpose of comparing the …

USTC-iFLYTEK at DocILE: A Multi-modal Approach Using Domain-specific GraphDoc.

Y Wang, J Du, J Ma, P Hu, Z Zhang, J Zhang - CLEF (Working Notes), 2023 - ceur-ws.org
With the development of digitalization in business, the automatic extraction of information
from semistructured business documents is becoming increasingly important. This paper …

An ID Badge Information Extractor Based on Object Detection and Optical Character Recognition

W Cavalcante, I Torné, L Camelo, R Fernandes… - IEEE …, 2024 - ieeexplore.ieee.org
Advancements in Artificial Intelligence and Deep Learning have impacted numerous fields,
particularly through innovations like You Only Look Once for object detection and …

What Happened in CLEF For Another While?

N Ferro - International Conference of the Cross-Language …, 2024 - Springer
2024 marks the 25th birthday for CLEF, an evaluation campaign activity which has
applied the Cranfield evaluation paradigm to the testing of multilingual and multimodal …

Retrieval Augmented Structured Generation: Business Document Information Extraction As Tool Use

FL Cesista, R Aguiar, J Kim, P Acilo - arXiv preprint arXiv:2405.20245, 2024 - arxiv.org
Business Document Information Extraction (BDIE) is the problem of transforming a blob of
unstructured information (raw text, scanned documents, etc.) into a structured format that …

Comparing state of the art rule-based tools for information extraction

D Lembo, FM Scafoglieri - International Joint Conference on Rules and …, 2023 - Springer
In this paper, we present a comparative analysis of the leading rule-based information
extraction systems in both research and industry, focusing on their main characteristics and …

GLiNER multi-task: Generalist Lightweight Model for Various Information Extraction Tasks

I Stepanov, M Shtopko - arXiv preprint arXiv:2406.12925, 2024 - arxiv.org
Information extraction tasks require both accurate, efficient, and generalisable models.
Classical supervised deep learning approaches can achieve the required performance, but …