A survey of Web crawlers for information retrieval

M Kumar, R Bhatia, D Rattan - Wiley Interdisciplinary Reviews …, 2017 - Wiley Online Library
Performance of any search engine relies heavily on its Web crawler. Web crawlers are the
programs that get webpages from the Web by following hyperlinks. These webpages are …

[BOOK][B] Principles of data integration

AH Doan, A Halevy, Z Ives - 2012 - books.google.com
Principles of Data Integration is the first comprehensive textbook of data integration,
covering theoretical principles and implementation issues as well as current challenges …

Crisislex: A lexicon for collecting and filtering microblogged communications in crises

A Olteanu, C Castillo, F Diaz, S Vieweg - Proceedings of the …, 2014 - ojs.aaai.org
Locating timely, useful information during crises and mass emergencies is critical for those
forced to make potentially life-altering decisions. As the use of Twitter to broadcast useful …

Web crawling

C Olston, M Najork - Foundations and Trends® in Information …, 2010 - nowpublishers.com
This is a survey of the science and practice of web crawling. While at first glance web
crawling may appear to be merely an application of breadth-first-search, the truth is that …

Crawling Ajax-based web applications through dynamic analysis of user interface state changes

A Mesbah, A Van Deursen, S Lenselink - ACM Transactions on the Web …, 2012 - dl.acm.org
Using JavaScript and dynamic DOM manipulation on the client side of Web applications is
becoming a widespread approach for achieving rich interactivity and responsiveness in …

Google's deep web crawl

J Madhavan, D Ko, Ł Kot, V Ganapathy… - Proceedings of the …, 2008 - dl.acm.org
The Deep Web, i.e., content hidden behind HTML forms, has long been acknowledged as a
significant gap in search engine coverage. Since it represents a large portion of the …

Crawling Ajax by inferring user interface state changes

A Mesbah, E Bozdag… - 2008 eighth international …, 2008 - ieeexplore.ieee.org
Ajax is a very promising approach for improving rich interactivity and responsiveness of web
applications. At the same time, Ajax techniques shatter the metaphor of a web "page" upon …

Customer churn analysis in telecom industry

K Dahiya, S Bhatia - 2015 4th International Conference on …, 2015 - ieeexplore.ieee.org
With the rapid development of telecommunication industry, the service providers are inclined
more towards expansion of the subscriber base. To meet the need of surviving in the …

Query selection techniques for efficient crawling of structured web sources

P Wu, JR Wen, H Liu, WY Ma - 22nd International Conference …, 2006 - ieeexplore.ieee.org
The high quality, structured data from Web structured sources is invaluable for many
applications. Hidden Web databases are not directly crawlable by Web search engines and …

Structured data on the web

MJ Cafarella, A Halevy, J Madhavan - Communications of the ACM, 2011 - dl.acm.org
Communications of the ACM, February 2011, Vol. 54, No. 2. Though the Web is best known as a
vast repository of shared …