A survey of Web crawlers for information retrieval
Performance of any search engine relies heavily on its Web crawler. Web crawlers are the
programs that get webpages from the Web by following hyperlinks. These webpages are …
[BOOK][B] Principles of data integration
Principles of Data Integration is the first comprehensive textbook of data integration,
covering theoretical principles and implementation issues as well as current challenges …
Crisislex: A lexicon for collecting and filtering microblogged communications in crises
Locating timely, useful information during crises and mass emergencies is critical for those
forced to make potentially life-altering decisions. As the use of Twitter to broadcast useful …
Web crawling
This is a survey of the science and practice of web crawling. While at first glance web
crawling may appear to be merely an application of breadth-first-search, the truth is that …
Crawling Ajax-based web applications through dynamic analysis of user interface state changes
Using JavaScript and dynamic DOM manipulation on the client side of Web applications is
becoming a widespread approach for achieving rich interactivity and responsiveness in …
Google's deep web crawl
J Madhavan, D Ko, Ł Kot, V Ganapathy… - Proceedings of the …, 2008 - dl.acm.org
The Deep Web, i.e., content hidden behind HTML forms, has long been acknowledged as a
significant gap in search engine coverage. Since it represents a large portion of the …
Crawling Ajax by inferring user interface state changes
Ajax is a very promising approach for improving rich interactivity and responsiveness of web
applications. At the same time, Ajax techniques shatter the metaphor of a web "page" upon …
Customer churn analysis in telecom industry
K Dahiya, S Bhatia - 2015 4th International Conference on …, 2015 - ieeexplore.ieee.org
With the rapid development of telecommunication industry, the service providers are inclined
more towards expansion of the subscriber base. To meet the need of surviving in the …
Query selection techniques for efficient crawling of structured web sources
The high quality, structured data from Web structured sources is invaluable for many
applications. Hidden Web databases are not directly crawlable by Web search engines and …
Structured data on the web
Communications of the ACM, February 2011, vol. 54, no. 2
Though the Web is best known as a vast repository of shared …