A survey of document image word spotting techniques

AP Giotis, G Sfikas, B Gatos, C Nikou - Pattern Recognition, 2017 - Elsevier
Vast collections of documents available in image format need to be indexed for information
retrieval purposes. In this framework, word spotting is an alternative solution to optical …

Degraded historical document binarization: A review on issues, challenges, techniques, and future directions

A Sulaiman, K Omar, MF Nasrudin - Journal of Imaging, 2019 - mdpi.com
In this era of digitization, most hardcopy documents are being transformed into digital
formats. In the process of transformation, large quantities of documents are stored and …

Robustness of Structured Data Extraction from In-Plane Rotated Documents Using Multi-Modal Large Language Models (LLM)

A Biswas, W Talukdar - Journal of Artificial Intelligence Research, 2024 - arxiv.org
Multi-modal large language models (LLMs) have shown remarkable performance in various
natural language processing tasks, including data extraction from documents. However, the …

A new local adaptive thresholding technique in binarization

TR Singh, S Roy, OI Singh, T Sinam… - arXiv preprint arXiv …, 2012 - arxiv.org
Image binarization is the process of separation of pixel values into two groups, white as
background and black as foreground. Thresholding plays a major role in binarization of images …

Robust document image binarization technique for degraded document images

B Su, S Lu, CL Tan - IEEE Transactions on Image Processing, 2012 - ieeexplore.ieee.org
Segmentation of text from badly degraded document images is a very challenging task due
to the high inter/intra-variation between the document background and the foreground text of …

Comparative analysis of image binarization methods for crack identification in concrete structures

H Kim, E Ahn, S Cho, M Shin, SH Sim - Cement and Concrete Research, 2017 - Elsevier
Surface cracks in concrete structures are critical indicators of structural damage and
durability. Manual visual inspection, the most commonly used method in practice, is …

DeepOtsu: Document enhancement and binarization using iterative deep learning

S He, L Schomaker - Pattern Recognition, 2019 - Elsevier
This paper presents a novel iterative deep learning framework and applies it to document
enhancement and binarization. Unlike the traditional methods that predict the binary label of …

Binarization of historical document images using the local maximum and minimum

B Su, S Lu, CL Tan - Proceedings of the 9th IAPR International …, 2010 - dl.acm.org
This paper presents a new document image binarization technique that segments the text
from badly degraded historical document images. The proposed technique makes use of the …

AdOtsu: An adaptive and parameterless generalization of Otsu's method for document image binarization

RF Moghaddam, M Cheriet - Pattern Recognition, 2012 - Elsevier
Adaptive binarization methods play a central role in document image processing. In this
work, an adaptive and parameterless generalization of Otsu's method is presented. The …

Document image binarization using background estimation and stroke edges

S Lu, B Su, CL Tan - International Journal on Document Analysis and …, 2010 - Springer
Document images often suffer from different types of degradation that render the document
image binarization a challenging task. This paper presents a document image binarization …