- Academic Search

Attention where it matters: Rethinking visual document understanding with selective region concentration‏

H Cao, C Bao, C Liu, H Chen, K Yin… - Proceedings of the …, 2023‏ - openaccess.thecvf.com‏

We propose a novel end-to-end document understanding model called SeRum (SElective
Region Understanding Model) for extracting meaningful information from document images …‏

Cited by 15 · All 5 versions

[PDF] ucl.ac.uk

You can even annotate text with voice: Transcription-only-supervised text spotting‏

J Tang, S Qiao, B Cui, Y Ma, S Zhang… - Proceedings of the 30th …, 2022‏ - dl.acm.org‏

End-to-end scene text spotting has recently gained great attention in the research
community. The majority of existing methods rely heavily on the location annotations of text …‏

Cited by 25 · All 3 versions

[PDF] acm.org

Filling in the blank: Rationale-augmented prompt tuning for TextVQA‏

G Zeng, Y Zhang, Y Zhou, B Fang, G Zhao… - Proceedings of the 31st …, 2023‏ - dl.acm.org‏

Recently, generative Text-based visual question answering (TextVQA) methods, which are
often based on language models, have exhibited impressive results and drawn increasing …‏

Cited by 8

[PDF] arxiv.org

ICDAR 2023 competition on structured text extraction from visually-rich document images‏

W Yu, C Zhang, H Cao, W Hua, B Li, H Chen… - … on Document Analysis …, 2023‏ - Springer‏

Structured text extraction is one of the most valuable and challenging application directions
in the field of Document AI. However, the scenarios of past benchmarks are limited, and the …‏

Cited by 8 · All 5 versions

Query-driven generative network for document information extraction in the wild‏

H Cao, X Li, J Ma, D Jiang, A Guo, Y Hu, H Liu… - Proceedings of the 30th …, 2022‏ - dl.acm.org‏

This paper focuses on solving Document Information Extraction (DIE) in the wild problem,
which is rarely explored before. In contrast to existing studies mainly tailored for document …‏

Cited by 11

[PDF] arxiv.org

Lapdoc: Layout-aware prompting for documents‏

M Lamott, YN Weweler, A Ulges, F Shafait… - … on Document Analysis …, 2024‏ - Springer‏

Recent advances in training large language models (LLMs) using massive amounts of
solely textual data lead to strong generalization across many domains and tasks, including …‏

Cited by 5 · All 6 versions

[PDF] arxiv.org

Deep Learning based Key Information Extraction from Business Documents: Systematic Literature Review‏

A Rombach, P Fettke - arXiv preprint arXiv:2408.06345, 2024 - arxiv.org

Extracting key information from documents represents a large portion of business workloads
and therefore offers a high potential for efficiency improvements and process automation …‏

All 2 versions

[PDF] cips-cl.org

Document information extraction via global tagging‏

S He, T Wang, Y Lu, H Lin, X Han, Y Sun… - … National Conference on …, 2023‏ - Springer‏

Document Information Extraction (DIE) is a crucial task for extracting key information
from visually-rich documents. The typical pipeline approach for this task involves Optical …‏

Cited by 3 · All 2 versions

GenTC: Generative Transformer via Contrastive Learning for Receipt Information Extraction‏

X Deng, Z Huang, K Ma, K Chen, J Guo… - … Conference on Artificial …, 2023‏ - Springer‏

Information Extraction from visually rich documents has attracted increasing
attention due to its various advanced applications in the real world. Most existing methods …‏

Cited by 1 · All 2 versions

[PDF] arxiv.org

First-place Solution for Streetscape Shop Sign Recognition Competition‏

B Wang, L **g - arXiv preprint arXiv:2501.02811, 2025 - arxiv.org

Text recognition technology applied to street-view storefront signs is increasingly utilized
across various practical domains, including map navigation, smart city planning analysis …‏

All 2 versions

GMN: generative multi-modal network for practical document information extraction