Handwritten optical character recognition (OCR): A comprehensive systematic literature review (SLR)

J Memon, M Sami, RA Khan, M Uddin - IEEE access, 2020 - ieeexplore.ieee.org
Given the ubiquity of handwritten documents in human transactions, Optical Character
Recognition (OCR) of documents have invaluable practical worth. Optical character …

Text recognition in the wild: A survey

X Chen, L **, Y Zhu, C Luo, T Wang - ACM Computing Surveys (CSUR), 2021 - dl.acm.org
The history of text can be traced back over thousands of years. Rich and precise semantic
information carried by text is important in a wide range of vision-based application …

Real-time scene text detection with differentiable binarization and adaptive scale fusion

M Liao, Z Zou, Z Wan, C Yao… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
Recently, segmentation-based scene text detection methods have drawn extensive attention
in the scene text detection field, because of their superiority in detecting the text instances of …

Scene text recognition with permuted autoregressive sequence models

D Bautista, R Atienza - European conference on computer vision, 2022 - Springer
Context-aware STR methods typically use internal autoregressive (AR) language models
(LM). Inherent limitations of AR models motivated two-stage methods which employ an …

Read like humans: Autonomous, bidirectional and iterative language modeling for scene text recognition

S Fang, H **e, Y Wang, Z Mao… - Proceedings of the …, 2021 - openaccess.thecvf.com
Linguistic knowledge is of great benefit to scene text recognition. However, how to effectively
model linguistic rules in end-to-end deep networks remains a research challenge. In this …

Conditional text image generation with diffusion models

Y Zhu, Z Li, T Wang, M He… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Current text recognition systems, including those for handwritten scripts and scene text, have
relied heavily on image synthesis and augmentation, since it is difficult to realize real-world …

Turning a clip model into a scene text detector

W Yu, Y Liu, W Hua, D Jiang… - Proceedings of the …, 2023 - openaccess.thecvf.com
The recent large-scale Contrastive Language-Image Pretraining (CLIP) model has shown
great potential in various downstream tasks via leveraging the pretrained vision and …

Deepsolo: Let transformer decoder with explicit points solo for text spotting

M Ye, J Zhang, S Zhao, J Liu, T Liu… - Proceedings of the …, 2023 - openaccess.thecvf.com
End-to-end text spotting aims to integrate scene text detection and recognition into a unified
framework. Dealing with the relationship between the two sub-tasks plays a pivotal role in …

Geolayoutlm: Geometric pre-training for visual information extraction

C Luo, C Cheng, Q Zheng… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Visual information extraction (VIE) plays an important role in Document Intelligence.
Generally, it is divided into two tasks: semantic entity recognition (SER) and relation …

Towards end-to-end unified scene text detection and layout analysis

S Long, S Qin, D Panteleev… - Proceedings of the …, 2022 - openaccess.thecvf.com
Scene text detection and document layout analysis have long been treated as two separate
tasks in different image domains. In this paper, we bring them together and introduce the …