A survey of OCR in Arabic language: applications, techniques, and challenges

S Faizullah, MS Ayub, S Hussain, MA Khan - Applied Sciences, 2023 - mdpi.com
Optical character recognition (OCR) is the process of extracting handwritten or printed text
from a scanned or printed image and converting it to a machine-readable form for further …

Causal reasoning meets visual representation learning: A prospective study

Y Liu, YS Wei, H Yan, GB Li, L Lin - Machine Intelligence Research, 2022 - Springer
Visual representation learning is ubiquitous in various real-world applications, including
visual comprehension, video understanding, multi-modal analysis, human-computer …

Scene text recognition with permuted autoregressive sequence models

D Bautista, R Atienza - European Conference on Computer Vision, 2022 - Springer
Context-aware STR methods typically use internal autoregressive (AR) language models
(LM). Inherent limitations of AR models motivated two-stage methods which employ an …

DeepSolo: Let transformer decoder with explicit points solo for text spotting

M Ye, J Zhang, S Zhao, J Liu, T Liu… - Proceedings of the …, 2023 - openaccess.thecvf.com
End-to-end text spotting aims to integrate scene text detection and recognition into a unified
framework. Dealing with the relationship between the two sub-tasks plays a pivotal role in …

Vision transformer for fast and efficient scene text recognition

R Atienza - International Conference on Document Analysis and …, 2021 - Springer
Scene text recognition (STR) enables computers to read text in natural scenes such as
object labels, road signs and instructions. STR helps machines perform informed decisions …

Multi-granularity prediction for scene text recognition

P Wang, C Da, C Yao - European Conference on Computer Vision, 2022 - Springer
Scene text recognition (STR) has been an active research topic in computer vision for years.
To tackle this challenging problem, numerous innovative methods have been successively …

Sequence-to-sequence contrastive learning for text recognition

A Aberdam, R Litman, S Tsiper… - Proceedings of the …, 2021 - openaccess.thecvf.com
We propose a framework for sequence-to-sequence contrastive learning (SeqCLR) of visual
representations, which we apply to text recognition. To account for the sequence-to …

Conditional text image generation with diffusion models

Y Zhu, Z Li, T Wang, M He… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Current text recognition systems, including those for handwritten scripts and scene text, have
relied heavily on image synthesis and augmentation, since it is difficult to realize real-world …

Towards weakly-supervised text spotting using a multi-task transformer

Y Kittenplon, I Lavi, S Fogel, Y Bar… - Proceedings of the …, 2022 - openaccess.thecvf.com
Text spotting end-to-end methods have recently gained attention in the literature due to the
benefits of jointly optimizing the text detection and recognition components. Existing …

CLIP4STR: a simple baseline for scene text recognition with pre-trained vision-language model

S Zhao, R Quan, L Zhu, Y Yang - IEEE Transactions on Image …, 2024 - ieeexplore.ieee.org
Pre-trained vision-language models (VLMs) are the de-facto foundation models for various
downstream tasks. However, scene text recognition methods still prefer backbones pre …