X-Diff: An effective change detection algorithm for XML documents

Y Wang, DJ DeWitt, JY Cai - Proceedings 19th international …, 2003 - ieeexplore.ieee.org
XML has become the de facto standard format for Web publishing and data transportation.
Since online information changes frequently, being able to quickly detect changes in XML …

Method and apparatus for focused crawling

D Jiang, A Krishnamurthy, JP Singh, R Wang - US Patent 7,080,073, 2006 - Google Patents
US7080073B1 - Method and apparatus for focused crawling - Google Patents

Newsjunkie: providing personalized newsfeeds via analysis of information novelty

E Gabrilovich, S Dumais, E Horvitz - Proceedings of the 13th …, 2004 - dl.acm.org
We present a principled methodology for filtering news stories by formal measures of
information novelty, and show how the techniques can be used to custom-tailor news feeds …

Rate of change and other metrics: a live study of the world wide web

F Douglis, A Feldmann, B Krishnamurthy… - USENIX Symposium on …, 1997 - usenix.org
The following paper was originally published in the Proceedings of the USENIX Symposium on Internet …

Method and apparatus for searching network resources

JP Singh, R Wang - US Patent 7,415,469, 2008 - Google Patents
US7415469B2 - Method and apparatus for searching network resources - Google Patents

WebBase: A repository of web pages

J Hirai, S Raghavan, H Garcia-Molina, A Paepcke - Computer Networks, 2000 - Elsevier
In this paper, we study the problem of constructing and maintaining a large shared
repository of Web pages. We discuss the unique characteristics of such a repository …

A short survey of document structure similarity algorithms

D Buttler - 2004 - osti.gov
This paper provides a brief survey of document structural similarity algorithms, including the
optimal Tree Edit Distance algorithm and various approximation algorithms. The …

Method and apparatus for searching network resources

JP Singh, R Wang - US Patent 6,915,294, 2005 - Google Patents
US6915294B1 - Method and apparatus for searching network resources - Google Patents

Host exchange in bill paying services

PA Hazlehurst, C Alvarez - US Patent 7,856,386, 2010 - Google Patents
An account exchange system is provided by a data aggregation service enabled for
gathering data for a subscriber from a data repository of a first financial institution, using …

Principles and methods for personalizing newsfeeds via an analysis of information novelty and dynamics

ST Dumais, EJ Horvitz, E Gabrilovich - US Patent 7,293,019, 2007 - Google Patents
US7293019B2 - Principles and methods for personalizing newsfeeds via an analysis of
information novelty and dynamics - Google Patents