A survey of OCR in Arabic language: applications, techniques, and challenges

S Faizullah, MS Ayub, S Hussain, MA Khan - Applied Sciences, 2023 - mdpi.com
Optical character recognition (OCR) is the process of extracting handwritten or printed text
from a scanned or printed image and converting it to a machine-readable form for further …

Exploring AI-driven approaches for unstructured document analysis and future horizons

SV Mahadevkar, S Patil, K Kotecha, LW Soong… - Journal of Big Data, 2024 - Springer
In the current industrial landscape, a significant number of sectors are grappling with the
challenges posed by unstructured data, which incurs financial losses amounting to millions …

Advancing OCR Accuracy in Image-to-LaTeX Conversion—A Critical and Creative Exploration

EZ Orji, A Haydar, İ Erşan, OO Mwambe - Applied Sciences, 2023 - mdpi.com
This paper comprehensively assesses the application of active learning strategies to
enhance natural language processing-based optical character recognition (OCR) models for …

F2M: Ensemble-based uncertainty estimation model for fire detection in indoor environments

M Arlović, M Patel, J Balen, F Hržić - Engineering Applications of Artificial …, 2024 - Elsevier
Early fire detection and timely notification are paramount for preventing human and material
casualties caused by fire. As a result, scientists have developed various fire monitoring …

An efficient method for disaster tweets classification using gradient-based optimized convolutional neural networks with BERT embeddings

D Dharrao, MR Aadithyanarayanan, R Mital, A Vengali… - MethodsX, 2024 - Elsevier
Event of the disastrous scenarios are actively discussed on microblogging platforms like
Twitter which can lead to chaotic situations. In the era of machine learning and deep …

A comparison of deep transfer learning backbone architecture techniques for printed text detection of different font styles from unstructured documents

S Mahadevkar, S Patil, K Kotecha, A Abraham - PeerJ Computer Science, 2024 - peerj.com
Object detection methods based on deep learning have been used in a variety of sectors
including banking, healthcare, e-governance, and academia. In recent years, there has …

Region Segmentation of Images Based on a Raster-Scan Paradigm

L Lukač, A Nerat, D Strnad, Š Horvat… - Journal of Sensor and …, 2024 - mdpi.com
This paper introduces a new method for the region segmentation of images. The approach is
based on the raster-scan paradigm and builds the segments incrementally. The pixels are …

PEaCE: A Chemistry-Oriented Dataset for Optical Character Recognition on Scientific Documents

N Zhang, C Heaton, ST Okonsky, P Mitra… - arXiv preprint arXiv …, 2024 - arxiv.org
Optical Character Recognition (OCR) is an established task with the objective of identifying
the text present in an image. While many off-the-shelf OCR models exist, they are often …

Sentiment Analysis of Beauty Product Reviews Using the IndoBERT Method and Naive Bayes Classification

HM Ramdhan, MD Purbolaksono… - … on Information and …, 2024 - ieeexplore.ieee.org
This paper presents an integrated approach to sentiment analysis of beauty product reviews
using the IndoBERT model combined with Naive Bayes classification, which specifically …

Applicability of OCR Engines for Text Recognition in Vehicle Number Plates, Receipts and Handwriting

U Poudel, AM Regmi, Z Stamenkovic… - J. Circuits Syst …, 2023 - researchgate.net
U. Poudel et al. experiments conducted on five different image categories: vehicle number
plates, receipts, handwriting, symbols and plain text images. Evaluation metrics such as …