- Academic Search

Survey of post-OCR processing approaches

TTH Nguyen, A Jatowt, M Coustaty… - ACM Computing Surveys …, 2021 - dl.acm.org

Optical character recognition (OCR) is one of the most popular techniques used for
converting printed documents into machine-readable ones. While OCR engines can do well …

Cited by 175

[PDF] mdpi.com

Survey of automatic spelling correction

D Hládek, J Staš, M Pleva - Electronics, 2020 - mdpi.com

Automatic spelling correction has been receiving sustained research attention. Although
each article contains a brief introduction to the topic, there is a lack of work that would …

Cited by 121

[PDF] arxiv.org

Parallel iterative edit models for local sequence transduction

A Awasthi, S Sarawagi, R Goyal, S Ghosh… - arXiv preprint arXiv …, 2019 - arxiv.org

We present a Parallel Iterative Edit (PIE) model for the problem of local sequence
transduction arising in tasks like Grammatical error correction (GEC). Recent approaches …

Cited by 187

[PDF] springer.com

Optical character recognition with neural networks and post-correction with finite state methods

S Drobac, K Lindén - International Journal on Document Analysis and …, 2020 - Springer

The optical character recognition (OCR) quality of the historical part of the Finnish
newspaper and journal corpus is rather low for reliable search and scientific research on the …

Cited by 79

[PDF] uzh.ch

Supervised OCR error detection and correction using statistical and neural machine translation methods

C Amrhein, S Clematide - Journal for Language Technology and …, 2018 - zora.uzh.ch

For indexing the content of digitized historical texts, optical character recognition (OCR)
errors are a hampering problem. To explore the effectivity of new strategies for OCR post …

Cited by 69

[PDF] mit.edu

Neural OCR post-hoc correction of historical corpora

L Lyu, M Koutraki, M Krickl, B Fetahu - Transactions of the Association …, 2021 - direct.mit.edu

Optical character recognition (OCR) is crucial for a deeper access to historical collections.
OCR needs to account for orthographic variations, typefaces, or language evolution (i.e., new …

Cited by 32

Social media text normalization for Turkish

G Eryiğit… - Natural Language …, 2017 - cambridge.org

Text normalization is an indispensable stage in processing noncanonical language from
natural sources, such as speech, social media or short text messages. Research in this field …

Cited by 54

[PDF] arxiv.org

Old content and modern tools - searching named entities in a Finnish OCRed historical newspaper collection 1771-1910

K Kettunen, E Mäkelä, T Ruokolainen… - arXiv preprint arXiv …, 2016 - arxiv.org

Named Entity Recognition (NER), search, classification and tagging of names and name like
frequent informational elements in texts, has become a standard information extraction …

Cited by 41

[PDF] helsinki.fi

OCR and post-correction of historical Finnish texts

S Drobac, PS Kauppinen… - Nordic Conference of …, 2017 - researchportal.helsinki.fi

This paper presents experiments on Optical character recognition (OCR) as a combination
of Ocropy software and data-driven spelling correction that uses Weighted Finite-State …

Cited by 31

[PDF] ed.ac.uk

Local string transduction as sequence labeling

J Ribeiro, S Narayan, S Cohen… - 27th International …, 2018 - research.ed.ac.uk

We show that the general problem of string transduction can be reduced to the problem of
sequence labeling. While character deletions and insertions are allowed in string …

Cited by 21

Data-driven spelling correction using weighted finite-state methods
