Document layout analysis: a comprehensive survey

GM Binmakhashen, SA Mahmoud - ACM Computing Surveys (CSUR), 2019 - dl.acm.org
Document layout analysis (DLA) is a preprocessing step of document understanding
systems. It is responsible for detecting and annotating the physical structure of documents …

Document structure analysis algorithms: a literature survey

S Mao, A Rosenfeld, T Kanungo - Document recognition and …, 2003 - spiedigitallibrary.org
Document structure analysis can be regarded as a syntactic analysis problem. The order
and containment relations among the physical or logical components of a document page …

M6Doc: a large-scale multi-format, multi-type, multi-layout, multi-language, multi-annotation category dataset for modern document layout analysis

H Cheng, P Zhang, S Wu, J Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Document layout analysis is a crucial prerequisite for document understanding, including
document retrieval and conversion. Most public datasets currently contain only PDF …

The OCRopus open source OCR system

TM Breuel - Document Recognition and Retrieval XV, 2008 - spiedigitallibrary.org
OCRopus is a new, open source OCR system emphasizing modularity, easy extensibility,
and reuse, aimed at both the research community and large scale commercial document …

Multi-scale multi-task FCN for semantic page segmentation and table detection

D He, S Cohen, B Price, D Kifer… - 2017 14th IAPR …, 2017 - ieeexplore.ieee.org
Page segmentation and table detection play an important role in understanding the structure
of documents. We present a page segmentation algorithm that incorporates state-of-the-art …

ICDAR2017 competition on page object detection

L Gao, X Yi, Z Jiang, L Hao… - 2017 14th IAPR …, 2017 - ieeexplore.ieee.org
This paper presents the results of ICDAR2017 Competition on Page Object Detection (POD).
POD is to detect page objects (tables, mathematical equations, graphics, figures, etc.) from …

Fundamental diagrams for multidirectional pedestrian flows

S Cao, A Seyfried, J Zhang, S Holl… - Journal of Statistical …, 2017 - iopscience.iop.org
Fundamental diagrams for uni-, bi- and multidirectional flows at corridors and crossings are
investigated by a series of experiments under laboratory conditions. At high densities …

Label-efficient deep learning in medical image analysis: Challenges and future directions

C Jin, Z Guo, Y Lin, L Luo, H Chen - arXiv preprint arXiv:2303.12484, 2023 - arxiv.org
Deep learning has seen rapid growth in recent years and achieved state-of-the-art
performance in a wide range of applications. However, training models typically requires …

A comprehensive survey of mostly textual document segmentation algorithms since 2008

S Eskenazi, P Gomez-Krämer, JM Ogier - Pattern Recognition, 2017 - Elsevier
In document image analysis, segmentation is the task that identifies the regions of a
document. The increasing number of applications of document analysis requires a good …

Handwritten Chinese text line segmentation by clustering with distance metric learning

F Yin, CL Liu - Pattern Recognition, 2009 - Elsevier
Separating text lines in unconstrained handwritten documents remains a challenge because
the handwritten text lines are often un-uniformly skewed and curved, and the space between …