OmniParser: A unified framework for text spotting, key information extraction and table recognition

J Wan, S Song, W Yu, Y Liu, W Cheng… - Proceedings of the …, 2024 - openaccess.thecvf.com
Recently, visually-situated text parsing (VsTP) has experienced notable advancements,
driven by the increasing demand for automated document understanding and the …

ODM: A text-image further alignment pre-training approach for scene text detection and spotting

C Duan, P Fu, S Guo, Q Jiang… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
In recent years, text-image joint pre-training techniques have shown promising results in
various tasks. However, in Optical Character Recognition (OCR) tasks, aligning text instances …

Lane2Seq: Towards unified lane detection via sequence generation

K Zhou - Proceedings of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
In this paper, we present a novel sequence generation-based framework for lane detection
called Lane2Seq. It unifies various lane detection formats by casting lane detection as a …

Platypus: A generalized specialist model for reading text in various forms

P Wang, Z Li, J Tang, H Zhong, F Huang… - … on Computer Vision, 2024 - Springer
Reading text from images (either natural scenes or documents) has been a long-standing
research topic for decades, due to the high technical challenge and wide application range …

DNTextSpotter: Arbitrary-shaped scene text spotting via improved denoising training

Q Qiao, Y Xie, J Gao, T Wu, S Huang, J Fan… - Proceedings of the …, 2024 - dl.acm.org
More and more end-to-end text spotting methods based on Transformer architecture have
demonstrated superior performance. These methods utilize a bipartite graph matching …

Hyper-local deformable transformers for text spotting on historical maps

Y Lin, YY Chiang - Proceedings of the 30th ACM SIGKDD Conference …, 2024 - dl.acm.org
Text on historical maps contains valuable information providing georeferenced historical,
political, and cultural contexts. However, text extraction from historical maps has been …

Hierarchical text spotter for joint text spotting and layout analysis

S Long, S Qin, Y Fujii, A Bissacco… - Proceedings of the …, 2024 - openaccess.thecvf.com
We propose Hierarchical Text Spotter (HTS), a novel method for the joint task of
word-level text spotting and geometric layout analysis. HTS can recognize text in an image …

Bridging the Gap Between End-to-End and Two-Step Text Spotting

M Huang, H Li, Y Liu, X Bai… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Modularity plays a crucial role in the development and maintenance of complex systems.
While end-to-end text spotting efficiently mitigates the issues of error accumulation and sub …

SCOB: Universal Text Understanding via Character-wise Supervised Contrastive Learning with Online Text Rendering for Bridging Domain Gap

D Kim, Y Kim, DH Kim, Y Lim… - Proceedings of the …, 2023 - openaccess.thecvf.com
Inspired by the great success of language model (LM)-based pre-training, recent studies in
visual document understanding have explored LM-based pre-training methods for modeling …

A Mixed-Precision Transformer Accelerator With Vector Tiling Systolic Array for License Plate Recognition in Unconstrained Scenarios

J Li, D Yan, F He, Z Dong… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Power efficiency for license plate recognition (LPR) under unconstrained scenarios is a
crucial factor in many edge-based real-world applications, e.g., autonomous vehicles whose …