Web mining in soft computing framework: relevance, state of the art and future directions

SK Pal, V Talwar, P Mitra - IEEE transactions on neural …, 2002 - ieeexplore.ieee.org
The paper summarizes the different characteristics of Web data, the basic components of
Web mining and its different types, and the current state of the art. The reason for …

Adaptive information extraction

J Turmo, A Ageno, N Catala - ACM Computing Surveys (CSUR), 2006 - dl.acm.org
The growing availability of online textual sources and the potential number of applications of
knowledge acquisition from textual data has led to an increase in Information Extraction …

[BOOK][B] Machine learning for text: An introduction

CC Aggarwal - 2018 - Springer
The extraction of useful insights from text with various types of statistical algorithms is
referred to as text mining, text analytics, or machine learning from text. The choice of …

[PDF][PDF] Incorporating non-local information into information extraction systems by Gibbs sampling

JR Finkel, T Grenager, CD Manning - Proceedings of the 43rd …, 2005 - aclanthology.org
Most current statistical natural language processing models use only local features so as to
permit dynamic programming in inference, but this makes them unable to fully account for …

Automating the construction of internet portals with machine learning

AK McCallum, K Nigam, J Rennie, K Seymore - Information Retrieval, 2000 - Springer
Domain-specific internet portals are growing in popularity because they gather
content from the Web and organize it for easy access, retrieval and search. For example …

[PDF][PDF] Maximum entropy Markov models for information extraction and segmentation.

A McCallum, D Freitag, FCN Pereira - ICML, 2000 - cseweb.ucsd.edu

Web mining research: A survey

R Kosala, H Blockeel - ACM SIGKDD Explorations Newsletter, 2000 - dl.acm.org
With the huge amount of information available online, the World Wide Web is a fertile area
for data mining research. Web mining research is at the crossroads of research from …

Unsupervised named-entity extraction from the web: An experimental study

O Etzioni, M Cafarella, D Downey, AM Popescu… - Artificial intelligence, 2005 - Elsevier
The KnowItAll system aims to automate the tedious process of extracting large collections of
facts (e.g., names of scientists or politicians) from the Web in an unsupervised, domain …

Web-scale information extraction in KnowItAll: (preliminary results)

O Etzioni, M Cafarella, D Downey, S Kok… - Proceedings of the 13th …, 2004 - dl.acm.org
Manually querying search engines in order to accumulate a large body of factual information
is a tedious, error-prone process of piecemeal search. Search engines retrieve and rank …

Dynamic conditional random fields: Factorized probabilistic models for labeling and segmenting sequence data

C Sutton, A McCallum, K Rohanimanesh - Journal of Machine Learning …, 2007 - jmlr.org
In sequence modeling, we often wish to represent complex interaction between labels, such
as when performing multiple, cascaded labeling tasks on the same sequence, or when long …