Machine learning in automated text categorization

F Sebastiani - ACM computing surveys (CSUR), 2002 - dl.acm.org
The automated categorization (or classification) of texts into predefined categories has
witnessed a booming interest in the last 10 years, due to the increased availability of …

[PDF] A tutorial on automated text categorisation

F Sebastiani - Proceedings of ASAI-99, 1st Argentinian …, 1999 - backup.blackwinter.de
The automated categorisation (or classification) of texts into topical categories has a long
history, dating back at least to 1960. Until the late '80s, the dominant approach to the problem …

[BOOK] Modern information retrieval

R Baeza-Yates, B Ribeiro-Neto - 1999 - people.ischool.berkeley.edu
Information retrieval (IR) has changed considerably in recent years with the expansion of the
World Wide Web and the advent of modern and inexpensive graphical user interfaces and …

[PDF] A comparative study on feature selection in text categorization

Y Yang, JO Pedersen - ICML, 1997 - Citeseer
This paper is a comparative study of feature selection methods in statistical learning of text
categorization. The focus is on aggressive dimensionality reduction. Five methods were …

Optimizing search engines using clickthrough data

T Joachims - Proceedings of the eighth ACM SIGKDD international …, 2002 - dl.acm.org
This paper presents an approach to automatically optimizing the retrieval quality of search
engines using clickthrough data. Intuitively, a good information retrieval system should …

A re-examination of text categorization methods

Y Yang, X Liu - Proceedings of the 22nd annual international ACM …, 1999 - dl.acm.org
This paper reports a controlled study with statistical significance tests on five text
categorization methods: the Support Vector Machines (SVM), a k-Nearest Neighbor (kNN) …

An evaluation of statistical approaches to text categorization

Y Yang - Information retrieval, 1999 - Springer
This paper focuses on a comparative evaluation of a wide range of text categorization
methods, including previously published results on the Reuters corpus and new results of …

[BOOK] Learning to classify text using support vector machines

T Joachims - 2012 - books.google.com
Text Classification, or the task of automatically assigning semantic categories to natural
language text, has become one of the key methods for organizing online information. Since …

Inductive learning algorithms and representations for text categorization

S Dumais, J Platt, D Heckerman… - Proceedings of the seventh …, 1998 - dl.acm.org
Text categorization–the assignment of natural language texts to one or more predefined
categories based on their content–is an important component in many information …

[PDF] Feature selection and feature extraction for text categorization

DD Lewis - Speech and Natural Language: Proceedings of a …, 1992 - aclanthology.org
The effect of selecting varying numbers and kinds of features for use in predicting category
membership was investigated on the Reuters and MUC-3 text categorization data sets …