Robust document image binarization technique for degraded document images
Segmentation of text from badly degraded document images is a very challenging task due
to the high inter/intra-variation between the document background and the foreground text of …
Camera-based analysis of text and documents: a survey
J Liang, D Doermann, H Li - International Journal of Document Analysis …, 2005 - Springer
The increasing availability of high-performance, low-priced, portable digital imaging devices
has created a tremendous opportunity for supplementing traditional scanning for document …
Progress in camera-based document image analysis
D Doermann, J Liang, H Li - Seventh International Conference …, 2003 - ieeexplore.ieee.org
The increasing availability of high performance, low priced, portable digital imaging devices
has created a tremendous opportunity for supplementing traditional scanning for document …
OCR binarization and image pre-processing for searching historical documents
MR Gupta, NP Jacobson, EK Garcia - Pattern Recognition, 2007 - Elsevier
We consider the problem of document binarization as a pre-processing step for optical
character recognition (OCR) for the purpose of keyword search of historical printed …
Using latent dirichlet allocation for automatic categorization of software
In this paper, we propose a technique called LACT for automatically categorizing software
systems in open-source repositories. LACT is based on Latent Dirichlet Allocation, an …
Machine printed text and handwriting identification in noisy document images
In this paper, we address the problem of the identification of text in noisy document images.
We are especially focused on segmenting and identifying between handwriting and …
Scene text understanding: recapitulating the past decade
Computational perception has indeed been dramatically modified and reformed from
handcrafted feature-based techniques to the advent of deep learning. Scene text …
An MRF model for binarization of natural scene text
Inspired by the success of MRF models for solving object segmentation problems, we
formulate the binarization problem in this framework. We represent the pixels in a document …
An improved parallel thinning algorithm
J Dong, W Lin, C Huang - 2016 International Conference on …, 2016 - ieeexplore.ieee.org
Thinning algorithms often cause stroke distortions at the crosses or intersections of strokes,
which lead to bad results in pattern recognition tasks. In order to overcome these drawbacks …
Image binarization for end-to-end text understanding in natural images
While modern off-the-shelf OCR engines show particularly high accuracy on scanned text,
text detection and recognition in natural images still remains a challenging problem. Here …