ChroniclingAmericaQA: A large-scale question answering dataset based on historical American newspaper pages
Question answering (QA) and Machine Reading Comprehension (MRC) tasks have
significantly advanced in recent years due to the rapid development of deep learning …
Leveraging open large language models for historical named entity recognition
The efficacy of large-scale language models (LLMs) as few-shot learners has dominated the
field of natural language processing, achieving state-of-the-art performance in most tasks …
PlayerTV: Advanced player tracking and identification for automatic soccer highlight clips
In the rapidly evolving field of sports analytics, the automation of targeted video processing
is a pivotal advancement. We propose PlayerTV, an innovative framework which harnesses …
Injecting temporal-aware knowledge in historical named entity recognition
In this paper, we address the detection of named entities in multilingual historical collections.
We argue that, besides the multiple challenges that depend on the quality of digitization (e.g. …
Deep learning approaches for information extraction from visually rich documents: datasets, challenges and methods
This paper focuses on Information Extraction from Visually Rich Documents, exploring how
deep learning methods are applied in this field. For the purpose of comparing the …
MHlinker: Research on a Joint Extraction Method of Fault Entity Relationship for Mine Hoist
X Dang, H Deng, X Dong, Z Zhu, F Li, L Wang - Electronics, 2023 - mdpi.com
Triplet extraction is the key technology to automatically construct knowledge graphs.
Extracting the triplet of mechanical equipment fault relationships is of great significance in …
Confidence-Aware Document OCR Error Detection
Optical Character Recognition (OCR) continues to face accuracy challenges that
impact subsequent applications. To address these errors, we explore the utility of OCR …
Enhancing OCR with line segmentation mask for container text recognition in container terminal
Optical Character Recognition (OCR) plays a pivotal role in enhancing the
operational efficiency of container ports. However, challenges such as angle limitations and …
Text Role Classification in Scientific Charts Using Multimodal Transformers
Text role classification involves classifying the semantic role of textual elements within
scientific charts. We propose to finetune the multimodal document layout analysis models …
Generalizability in Document Layout Analysis for Scientific Article Figure & Caption Extraction
JP Naiman - arXiv preprint arXiv:2301.10781, 2023 - arxiv.org
The lack of generalizability--in which a model trained on one dataset cannot provide
accurate results for a different dataset--is a known problem in the field of document layout …