[BOOK][B] The text mining handbook: advanced approaches in analyzing unstructured data

R Feldman, J Sanger - 2007 - books.google.com
Text mining is a new and exciting area of computer science research that tries to solve the
crisis of information overload by combining techniques from data mining, machine learning …

A survey of document image classification: problem statement, classifier architecture and performance evaluation

N Chen, D Blostein - International Journal of Document Analysis and …, 2007 - Springer
Document image classification is an important step in Office Automation, Digital Libraries,
and other document image analysis applications. There is great diversity in document image …

Text/graphics separation revisited

K Tombre, S Tabbone, L Pélissier, B Lamiroy… - … Analysis Systems V: 5th …, 2002 - Springer
Text/graphics separation aims at segmenting the document into two layers: a layer assumed
to contain text and a layer containing graphical objects. In this paper, we present a …

The concept of document warehousing for multi-dimensional modeling of textual-based business intelligence

FSC Tseng, AYH Chou - Decision Support Systems, 2006 - Elsevier
During the past decade, data warehousing has been widely adopted in the business
community. It provides multi-dimensional analyses on cumulated historical business data for …

Document image classification with intra-domain transfer learning and stacked generalization of deep convolutional neural networks

A Das, S Roy, U Bhattacharya… - 2018 24th international …, 2018 - ieeexplore.ieee.org
In this article, a region-based Deep Convolutional Neural Network framework is presented
for document structure learning. The contribution of this work involves efficient training of …

Logo and seal based administrative document image retrieval: a survey

A Alaei, PP Roy, U Pal - Computer Science Review, 2016 - Elsevier
With the advance of technology, business offices and organizations together with their
clients create a massive amount of administrative documents every day. Administrative …

Hidden tree Markov models for document image classification

M Diligenti, P Frasconi, M Gori - IEEE Transactions on pattern …, 2003 - ieeexplore.ieee.org
Classification is an important problem in image document processing and is often a
preliminary step toward recognition, understanding, and information extraction. In this paper …

Embedded textual content for document image classification with convolutional neural networks

L Noce, I Gallo, A Zamberletti, A Calefati - Proceedings of the 2016 ACM …, 2016 - dl.acm.org
In this paper we introduce a novel document image classification method based on
combined visual and textual information. The proposed algorithm's pipeline is inspired by the …

Analysis and understanding of multi-class invoices

F Cesarini, E Francesconi, M Gori, G Soda - Document Analysis and …, 2003 - Springer
In this paper a system for processing documents that can be grouped into classes is
illustrated. We have considered invoices as a case-study. The system is divided into three …

Tsallis mutual information for document classification

M Vila, A Bardera, M Feixas, M Sbert - Entropy, 2011 - mdpi.com
Mutual information is one of the most widely used measures for evaluating image similarity. In this
paper, we investigate the application of three different Tsallis-based generalizations of …