Degraded historical document binarization: A review on issues, challenges, techniques, and future directions

A Sulaiman, K Omar, MF Nasrudin - Journal of Imaging, 2019 - mdpi.com
In this era of digitization, most hardcopy documents are being transformed into digital
formats. In the process of transformation, large quantities of documents are stored and …

Historical document image binarization: A review

C Tensmeyer, T Martinez - SN Computer Science, 2020 - Springer
This review provides a comprehensive view of the field of historical document image
binarization with a focus on the contributions made in the last decade. After the introduction …

Document image binarization with fully convolutional neural networks

C Tensmeyer, T Martinez - 2017 14th IAPR international …, 2017 - ieeexplore.ieee.org
Binarization of degraded historical manuscript images is an important pre-processing step
for many document processing tasks. We formulate binarization as a pixel classification …

DocEnTr: An end-to-end document image enhancement transformer

MA Souibgui, S Biswas, SK Jemni… - 2022 26th …, 2022 - ieeexplore.ieee.org
Document images can be affected by many degradation scenarios, which cause recognition
and processing difficulties. In this age of digitization, it is important to denoise them for …

Enhance to read better: a multi-task adversarial network for handwritten document image enhancement

SK Jemni, MA Souibgui, Y Kessentini, A Fornés - Pattern Recognition, 2022 - Elsevier
Handwritten document images can be highly affected by degradation for different reasons:
paper ageing, daily-life scenarios (wrinkles, dust, etc.), bad scanning process and so on …

A selectional auto-encoder approach for document image binarization

J Calvo-Zaragoza, AJ Gallego - Pattern Recognition, 2019 - Elsevier
Binarization plays a key role in the automatic information retrieval from document images.
This process is usually performed in the first stages of document analysis systems, and …

A survey of historical document image datasets

K Nikolaidou, M Seuret, H Mokayed… - International Journal on …, 2022 - Springer
This paper presents a systematic literature review of image datasets for document image
analysis, focusing on historical documents, such as handwritten manuscripts and early …

AutoPart: Automating schema design for large scientific databases using data partitioning

S Papadomanolakis, A Ailamaki - … International Conference on …, 2004 - ieeexplore.ieee.org
Database applications that use multi-terabyte datasets are becoming increasingly important
for scientific fields such as astronomy and biology. Scientific databases are particularly …

Text-DIAE: a self-supervised degradation invariant autoencoder for text recognition and document enhancement

MA Souibgui, S Biswas, A Mafla, AF Biten… - Proceedings of the …, 2023 - ojs.aaai.org
In this paper, we propose a Text-Degradation Invariant Auto Encoder (Text-DIAE), a
self-supervised model designed to tackle two tasks, text recognition (handwritten or scene-text) …

Generate, transform, and clean: the role of GANs and transformers in palm leaf manuscript generation and enhancement

N Thuon, J Du, Z Zhang, J Ma, P Hu - International Journal on Document …, 2024 - Springer
Palm leaf manuscripts offer a rich source of data critical for document analysis tasks,
including character, word, and text analysis. However, their cleaning and denoising present …