Survey of post-OCR processing approaches

TTH Nguyen, A Jatowt, M Coustaty… - ACM Computing Surveys …, 2021 - dl.acm.org
Optical character recognition (OCR) is one of the most popular techniques used for
converting printed documents into machine-readable ones. While OCR engines can do well …

Randomized trees for human pose detection

G Rogez, J Rihan, S Ramalingam… - … IEEE Conference on …, 2008 - ieeexplore.ieee.org
This paper addresses human pose recognition from video sequences by formulating it as a
classification problem. Unlike much previous work we do not make any assumptions on the …

Progressive alignment and discriminative error correction for multiple OCR engines

WB Lund, DD Walker… - … Conference on Document …, 2011 - ieeexplore.ieee.org
This paper presents a novel method for improving optical character recognition (OCR). The
method employs the progressive alignment of hypotheses from multiple OCR engines …

Synthetically Augmented Self-Supervised Fine-Tuning for Diverse Text OCR Correction

S Guan, D Greene - ECAI 2024, 2024 - ebooks.iospress.nl
The adoption of Optical Character Recognition (OCR) tools has been central to the
increased digitization of historical documents. However, the errors introduced during OCR …

How well does multiple OCR error correction generalize?

WB Lund, EK Ringger… - Document Recognition and …, 2014 - spiedigitallibrary.org
As the digitization of historical documents, such as newspapers, becomes more common,
the need of the archive patron for accurate digital text from those documents increases …

On automating editions: the affordances of handwritten text recognition platforms for scholarly editing

M Terras, J Nockels, P Gooding - Scholarly Editing, 2023 - eprints.gla.ac.uk
How can scholarly editors best make use of recent developments in Handwritten Text
Recognition (HTR), where the products of automated text recognition can appear as …

Why multiple document image binarizations improve OCR

WB Lund, DJ Kennard, EK Ringger - Proceedings of the 2nd …, 2013 - dl.acm.org
Our previous work has shown that the error correction of optical character recognition (OCR)
on degraded historical machine-printed documents is improved with the use of multiple …

Selection technique for multiple outputs of optical character recognition

IQ Habeeb, ZQ Al-Zaydi… - Eurasian Journal of …, 2020 - mathnet.ru
The approach of OCR multiple outputs is used to improve accuracy for low scanning
resolution images. The idea of this approach is to incorporate information from multiple …

Enhanced ensemble technique for optical character recognition

IQ Habeeb, ZQ Al-Zaydi, HN Abdulkhudhur - International Conference on …, 2018 - Springer
Optical character recognition (OCR) is the electronic transformation of images into a
computer-encoded text. OCR systems often produce poor accuracy for noisy images …

[BOOK] Ensemble Methods for Historical Machine-Printed Document Recognition

WB Lund - 2014 - search.proquest.com
The usefulness of digitized documents is directly related to the quality of the extracted text.
Optical Character Recognition (OCR) has reached a point where well-formatted and clean …