An overview on XML similarity: Background, current trends and future directions

J Tekli, R Chbeir, K Yetongnon - Computer science review, 2009 - Elsevier
In recent years, XML has been established as a major means for information management,
and has been broadly utilized for complex data representation (e.g., multimedia objects) …

An overview on XML semantic disambiguation from unstructured text to semi-structured data: Background, applications, and ongoing challenges

J Tekli - IEEE Transactions on Knowledge and Data …, 2016 - ieeexplore.ieee.org
Since the last two decades, XML has gained momentum as the standard for web information
management and complex data representation. Also, collaboratively built semi-structured …

Detecting and characterizing bots that commit code

T Dey, S Mousavi, E Ponce, T Fry, B Vasilescu… - Proceedings of the 17th …, 2020 - dl.acm.org
Background: Some developer activity traditionally performed manually, such as making
code commits, opening, managing, or closing issues is increasingly subject to automation in …

Keyword search over relational databases: a metadata approach

S Bergamaschi, E Domnori, F Guerra… - Proceedings of the …, 2011 - dl.acm.org
Keyword queries offer a convenient alternative to traditional SQL in querying relational
databases with large, often unknown, schemas and instances. The challenge in answering …

The pq-gram distance between ordered labeled trees

N Augsten, M Böhlen, J Gamper - ACM Transactions on Database …, 2008 - dl.acm.org
When integrating data from autonomous sources, exact matches of data items that represent
the same real-world object often fail due to a lack of common keys. Yet in many cases …

A novel XML document structure comparison framework based-on sub-tree commonalities and label semantics

J Tekli, R Chbeir - Journal of Web Semantics, 2012 - Elsevier
XML similarity evaluation has become a central issue in the database and information
communities, its applications ranging over document clustering, version control, data …

A Family of LZ78-based Universal Sequential Probability Assignments

N Sagan, T Weissman - arXiv preprint arXiv:2410.06589, 2024 - arxiv.org
We propose and study a family of universal sequential probability assignments on individual
sequences, based on the incremental parsing procedure of the Lempel-Ziv (LZ78) …

Clustering XML documents by patterns

M Piernik, D Brzezinski, T Morzy - Knowledge and Information Systems, 2016 - Springer
Now that the use of XML is prevalent, methods for mining semi-structured documents have
become even more important. In particular, one of the areas that could greatly benefit from in …

XML clustering: a review of structural approaches

M Piernik, D Brzezinski, T Morzy… - The Knowledge …, 2015 - cambridge.org
With its presence in data integration, chemistry, biological, and geographic systems,
eXtensible Markup Language (XML) has become an important standard not only in …

X-class: Associative classification of XML documents by structure

G Costa, R Ortale, E Ritacco - ACM Transactions on Information Systems …, 2013 - dl.acm.org
The supervised classification of XML documents by structure involves learning predictive
models in which certain structural regularities discriminate the individual document classes …