An overview of web data clustering practices

A Vakali, J Pokorný, T Dalamagas - International conference on extending …, 2004 - Springer
Clustering is a challenging topic in the area of Web data management. Various forms of
clustering are required in a wide range of applications, including finding mirrored Web …

Method and system for processing documents through document history encapsulation

JY Vion-Dury - US Patent 9,448,986, 2016 - Google Patents
(57) ABSTRACT A computer-implemented system and method for processing a markup
language document and its change history are provided. The method includes receiving first …

A change detection system for unordered XML data using a relational model

S Sundaram, SK Madria - Data & Knowledge Engineering, 2012 - Elsevier
The dramatic increase in the evolution of XML data available on the Internet requires a
change detection system to keep track of important changes occurring during their lifetime …

Integration of web sources under uncertainty and dependencies using probabilistic XML

ML Ba, S Montenez, R Tang, T Abdessalem - Database Systems for …, 2014 - Springer
We study in this vision paper the problem of integrating several web data sources under
uncertainty and dependencies. We present a concrete application with web sources about …

A novel Web archiving approach based on visual pages analysis

MB Saad, S Gançarski, Z Pehlivan - The 9th International Web Archiving …, 2009 - Citeseer
Due to the growing importance of the World Wide Web, archiving the web has become a
cultural necessity in preserving knowledge. To keep a web archive up to date, crawlers …

Accurate and efficient HTML differencing

R Mikhaiel, E Stroulia - 13th IEEE International Workshop on …, 2005 - ieeexplore.ieee.org
Recognizing the differences between subsequent versions of HTML documents is an
important problem. It is useful for managers of multi-authored Web sites who need to review …

Merging Uncertain Multi-Version XML Documents.

ML Ba, T Abdessalem, P Senellart - DChanges, 2013 - ceur-ws.org
Merging is a fundamental operation in revision control systems that enables integrating
different changes made to the same documents. In open platforms, such as Wikipedia …

Diffing, patching and merging XML documents: toward a generic calculus of editing deltas.

JY Vion-Dury - Proceedings of the 10th ACM symposium on Document …, 2010 - dl.acm.org
This work addresses what we believe to be a central issue in the field of XML diff and merge
computation: the mathematical modeling of the so-called "editing deltas" and the study of …

XML Diff and patch tool

K Komvoteas - MS in Distributed Multimedia and Information Systems …, 2003 - Citeseer
The increasing use of XML over the last few years has led to the creation of many differencing and
patching tools capable of handling tree-structured documents. However, all of those tools …

On change detection of XML Schemas

A Baqasah, E Pardede, I Holubova… - 2013 12th IEEE …, 2013 - ieeexplore.ieee.org
Change detection of XML data has emerged as an important research issue in the last
decade; however, the majority of change detection algorithms focus on XML documents …