A survey on deep domain adaptation and tiny object detection challenges, techniques and datasets

M Muzammul, X Li - arXiv preprint arXiv:2107.07927, 2021 - arxiv.org
This survey paper specially analyzed computer vision-based object detection challenges
and solutions by different techniques. We mainly highlighted object detection by three …

A Benchmark of Named Entity Recognition Approaches in Historical Documents: Application to 19th Century French Directories

N Abadie, E Carlinet, J Chazalon… - International Workshop on …, 2022 - Springer
Named entity recognition (NER) is a necessary step in many pipelines targeting historical
documents. Indeed, such natural language processing techniques identify which class each …

A benchmark of nested named entity recognition approaches in historical structured documents

S Tual, N Abadie, J Chazalon, B Duménieu… - … on Document Analysis …, 2023 - Springer
Named Entity Recognition (NER) is a key step in the creation of structured data from
digitised historical documents. Traditional NER approaches deal with flat named entities …

Document Layout Analysis with Deep Learning and Heuristics

V Rezanezhad, K Baierer, M Gerber… - Proceedings of the 7th …, 2023 - dl.acm.org
The automated yet highly accurate layout analysis (segmentation) of historical document
images remains a key challenge for the improvement of Optical Character Recognition …

OCR improvements for images of multi-page historical documents

I Gruber, M Hrúz, P Ircing, P Neduchal, T Zítka… - … Conference on Speech …, 2021 - Springer
This work presents a pipeline for processing digitally scanned documents, reading their
textual content, and storing it in a dataset for the purpose of information retrieval. The …

Towards Writing Style Adaptation in Handwriting Recognition

J Kohút, M Hradiš, M Kišš - International Conference on Document …, 2023 - Springer
One of the challenges of handwriting recognition is to transcribe a large number of vastly
different writing styles. State-of-the-art approaches do not explicitly use information about the …

Between History and Natural Language Processing: Study, Enrichment and Online Publication of French Parliamentary Debates of the Early Third Republic (1881 …

M Puren, A Pellet, N Bourgeois, P Vernus… - ParlaCLARIN III at …, 2022 - hal.science
We present the AGODA (Analyse sémantique et Graphes relationnels pour l'Ouverture des
Débats à l'Assemblée nationale) project, which aims to create a platform for consulting and …

Fine-Tuning is a Surprisingly Effective Domain Adaptation Baseline in Handwriting Recognition

J Kohút, M Hradiš - International Conference on Document Analysis and …, 2023 - Springer
In many machine learning tasks, a large general dataset and a small specialized dataset are
available. In such situations, various domain adaptation methods can be used to adapt a …

A processing chain for extracting and providing online access to annotated and semantically enriched historical data. The AGODA project

P Vernus, A Pellet, N Bourgeois, F Lebreton… - Digital Humanities …, 2022 - hal.science
The AGODA project is one of five pilot projects supported by the DataLab of the Bibliothèque
nationale de France. It aims to create an online platform facilitating the exploration and use …

Cross-modal named entity recognition for image text in procurement documents

杨赛, 刘昕, 于绍文 - Journal of Computer Engineering & …, 2024 - search.ebscohost.com
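The digitally intelligent procurement stage of a smart supply chain can improve procurement efficiency and save substantial labor costs. Procurement documents include a large number
of files such as licenses and qualification certificates; to address problems in their image text such as uneven text layout and unclear scanned images …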