Robust document image binarization technique for degraded document images

B Su, S Lu, CL Tan - IEEE Transactions on Image Processing, 2012 - ieeexplore.ieee.org
Segmentation of text from badly degraded document images is a very challenging task due
to the high inter/intra-variation between the document background and the foreground text of …

Camera-based analysis of text and documents: a survey

J Liang, D Doermann, H Li - International Journal of Document Analysis …, 2005 - Springer
The increasing availability of high-performance, low-priced, portable digital imaging devices
has created a tremendous opportunity for supplementing traditional scanning for document …

Progress in camera-based document image analysis

D Doermann, J Liang, H Li - Seventh International Conference …, 2003 - ieeexplore.ieee.org
The increasing availability of high performance, low priced, portable digital imaging devices
has created a tremendous opportunity for supplementing traditional scanning for document …

OCR binarization and image pre-processing for searching historical documents

MR Gupta, NP Jacobson, EK Garcia - Pattern Recognition, 2007 - Elsevier
We consider the problem of document binarization as a pre-processing step for optical
character recognition (OCR) for the purpose of keyword search of historical printed …

Using latent Dirichlet allocation for automatic categorization of software

K Tian, M Revelle, D Poshyvanyk - 2009 6th IEEE international …, 2009 - ieeexplore.ieee.org
In this paper, we propose a technique called LACT for automatically categorizing software
systems in open-source repositories. LACT is based on Latent Dirichlet Allocation, an …

Machine printed text and handwriting identification in noisy document images

Y Zheng, H Li, D Doermann - IEEE transactions on pattern …, 2004 - ieeexplore.ieee.org
In this paper, we address the problem of the identification of text in noisy document images.
We are especially focused on segmenting and identifying between handwriting and …

Scene text understanding: recapitulating the past decade

M Ghosh, H Mukherjee, SM Obaidullah, XZ Gao… - Artificial Intelligence …, 2023 - Springer
Computational perception has indeed been dramatically modified and reformed from
handcrafted feature-based techniques to the advent of deep learning. Scene text …

An MRF model for binarization of natural scene text

A Mishra, K Alahari, CV Jawahar - … International Conference on …, 2011 - ieeexplore.ieee.org
Inspired by the success of MRF models for solving object segmentation problems, we
formulate the binarization problem in this framework. We represent the pixels in a document …

An improved parallel thinning algorithm

J Dong, W Lin, C Huang - 2016 International Conference on …, 2016 - ieeexplore.ieee.org
Thinning algorithms often cause stroke distortions at the crosses or intersections of strokes,
which lead to bad results in pattern recognition tasks. In order to overcome these drawbacks …

Image binarization for end-to-end text understanding in natural images

S Milyaev, O Barinova, T Novikova… - 2013 12th …, 2013 - ieeexplore.ieee.org
While modern off-the-shelf OCR engines show particularly high accuracy on scanned text,
text detection and recognition in natural images still remains a challenging problem. Here …