Transformer for handwritten text recognition using bidirectional post-decoding

C Wick, J Zöllner, T Grüning - International Conference on Document …, 2021 - Springer
Most recently, Transformers–which are recurrent-free neural network architectures–
achieved tremendous performances on various Natural Language Processing (NLP) tasks …

Natural language processing for cultural heritage domains

C Sporleder - Language and Linguistics Compass, 2010 - Wiley Online Library
Museums, archives, libraries and other cultural heritage institutes maintain large collections
of artefacts, which are valuable knowledge sources for both experts and interested lay …

A survey of text alignment visualization

T Yousef, S Janicke - IEEE transactions on visualization and …, 2020 - ieeexplore.ieee.org
Text alignment is one of the fundamental techniques in text-related domains like natural
language processing, computational linguistics, and digital humanities. It compares two or …

An OCR post-correction approach using deep learning for processing medical reports

S Karthikeyan, AGS de Herrera… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
According to a recent Deloitte study, the COVID-19 pandemic continues to place a huge
strain on the global health care sector. COVID-19 has also catalysed digital transformation …

OCR post-correction for detecting adversarial text images

NH Imam, VG Vassilakis, D Kolovos - Journal of Information Security and …, 2022 - Elsevier
The amount of images with embedded text shared on Online Social Networks (OSNs), such
as Twitter or Facebook, has been growing in recent years. It is becoming important to …

Optical character recognition of 19th century classical commentaries: the current state of affairs

M Romanello, S Najem-Meyer… - Proceedings of the 6th …, 2021 - dl.acm.org
Together with critical editions and translations, commentaries are one of the main genres of
publication in literary and textual scholarship, and have a century-long tradition. Yet, the …

Multi-input attention for unsupervised OCR correction

R Dong, DA Smith - Proceedings of the 56th Annual Meeting of …, 2018 - aclanthology.org
We propose a novel approach to OCR post-correction that exploits repeated texts in large
corpora both as a source of noisy target outputs for unsupervised training and as a source of …

A fast alignment scheme for automatic OCR evaluation of books

IZ Yalniz, R Manmatha - 2011 International Conference on …, 2011 - ieeexplore.ieee.org
This paper aims to evaluate the accuracy of optical character recognition (OCR) systems on
real scanned books. The ground truth e-texts are obtained from the Project Gutenberg …

PNRank: Unsupervised ranking of person name entities from noisy OCR text

H Dutta, A Gupta - Decision Support Systems, 2022 - Elsevier
Text databases have grown tremendously in number, size, and volume over the last few
decades. Optical Character Recognition (OCR) software is used to scan the text and make …

Improving OCR accuracy on early printed books by utilizing cross fold training and voting

C Reul, U Springmann, C Wick… - 2018 13th IAPR …, 2018 - ieeexplore.ieee.org
In this paper we introduce a method that significantly reduces the character error rates for
OCR text obtained from OCRopus models trained on early printed books. The method uses …