Text line and word segmentation of handwritten documents
In this paper, we present a segmentation methodology of handwritten documents in their
distinct entities, namely, text lines and words. Text line segmentation is achieved by applying …
Printed Ottoman text recognition using synthetic data and data augmentation
EF Bilgin Tasdemir - International Journal on Document Analysis and …, 2023 - Springer
The Ottoman script, which was in use for over five centuries, is an Arabic alphabet-based
writing system. It became obsolete after the change of alphabet in Turkey. There are plenty …
Matching word images for content-based retrieval from printed document images
As large quantity of document images is getting archived by the digital libraries, there is a
need for an efficient search strategies to make them available as per users information need …
SHIBR—The Swedish historical birth records: A semi-annotated dataset
This paper presents a digital image dataset of historical handwritten birth records stored in
the archives of several parishes across Sweden, together with the corresponding metadata …
GAN-based text line segmentation method for challenging handwritten documents
Text line segmentation (TLS) is an essential step of the end-to-end document analysis
systems. The main purpose of this step is to extract the individual text lines of any …
Matching ottoman words: an image retrieval approach to historical document indexing
Large archives of Ottoman documents are challenging to many historians all over the world.
However, these archives remain inaccessible since manual transcription of such a huge …
HAH manuscripts: A holistic paradigm for classifying and retrieving historical Arabic handwritten documents
Z Al Aghbari, S Brook - Expert Systems with Applications, 2009 - Elsevier
Technologies for reading and searching digital documents have helped academic
researchers; however, truly effective search engines for handwritten documents have not …
Efficient search in document image collections
This paper presents an efficient indexing and retrieval scheme for searching in document
image databases. In many non-European languages, optical character recognizers are not …
Efficient algorithms for text lines and words segmentation for recognition of Arabic handwritten script
A new methodology for Arabic handwritten document images segmentation is done in this
paper to segment the documents into distinct entities as words and text lines. Based on …
A line-based representation for matching words in historical manuscripts
In this study, we propose a new method for retrieving and recognizing words in historical
documents. We represent word images with a set of line segments. Then we provide a …