OmniParser: A unified framework for text spotting, key information extraction and table recognition

J Wan, S Song, W Yu, Y Liu, W Cheng… - Proceedings of the …, 2024 - openaccess.thecvf.com
Recently visually-situated text parsing (VsTP) has experienced notable advancements
driven by the increasing demand for automated document understanding and the …

ESTextSpotter: Towards better scene text spotting with explicit synergy in transformer

M Huang, J Zhang, D Peng, H Lu… - Proceedings of the …, 2023 - openaccess.thecvf.com
In recent years, end-to-end scene text spotting approaches are evolving to the
Transformer-based framework. While previous studies have shown the crucial importance of the intrinsic …

Towards robust real-time scene text detection: From semantic to instance representation learning

X Qin, P Lyu, C Zhang, Y Zhou, K Yao… - Proceedings of the 31st …, 2023 - dl.acm.org
Due to the flexible representation of arbitrary-shaped scene text and simple pipeline,
bottom-up segmentation-based methods begin to be mainstream in real-time scene text detection …

Beyond OCR + VQA: Towards end-to-end reading and reasoning for robust and accurate TextVQA

G Zeng, Y Zhang, Y Zhou, X Yang, N Jiang, G Zhao… - Pattern Recognition, 2023 - Elsevier
Text-based visual question answering (TextVQA), which answers a visual question by
considering both visual contents and scene texts, has attracted increasing attention recently …

Visual Text Meets Low-level Vision: A Comprehensive Survey on Visual Text Processing

Y Shu, W Zeng, Z Li, F Zhao, Y Zhou - arXiv preprint arXiv:2402.03082, 2024 - arxiv.org
Visual text, a pivotal element in both document and scene images, speaks volumes and
attracts significant attention in the computer vision domain. Beyond visual text detection and …

Divide Rows and Conquer Cells: Towards Structure Recognition for Large Tables

H Shen, X Gao, J Wei, L Qiao, Y Zhou, Q Li, Z Cheng - IJCAI, 2023 - researchgate.net
Recent advanced Table Structure Recognition (TSR) models adopt image-to-text
solutions to parse table structure. These methods can be formulated as image caption …

DNTextSpotter: Arbitrary-shaped scene text spotting via improved denoising training

Q Qiao, Y Xie, J Gao, T Wu, S Huang, J Fan… - Proceedings of the …, 2024 - dl.acm.org
More and more end-to-end text spotting methods based on Transformer architecture have
demonstrated superior performance. These methods utilize a bipartite graph matching …

Perceiving ambiguity and semantics without recognition: an efficient and effective ambiguous scene text detector

Y Shu, W Wang, Y Zhou, S Liu, A Zhang… - Proceedings of the 31st …, 2023 - dl.acm.org
Ambiguous scene text detection is an extremely challenging task. Existing text detectors that
rely solely on visual cues often suffer from confusion due to being evenly distributed in …

Inverse-like antagonistic scene text spotting via reading-order estimation and dynamic sampling

SX Zhang, C Yang, X Zhu, H Zhou… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Scene text spotting is a challenging task, especially for inverse-like scene text, which has
complex layouts, e.g., mirrored, symmetrical, or retro-flexed. In this paper, we propose a …

First creating backgrounds then rendering texts: A new paradigm for visual text blending

Z Li, Y Shu, W Zeng, D Yang, Y Zhou - arXiv preprint arXiv:2410.10168, 2024 - arxiv.org
Diffusion models, known for their impressive image generation abilities, have played a
pivotal role in the rise of visual text generation. Nevertheless, existing visual text generation …