Large-scale deduplication with constraints using dedupalog

A Arasu, C Ré, D Suciu - 2009 IEEE 25th International …, 2009 - ieeexplore.ieee.org
We present a declarative framework for collective deduplication of entity references in the
presence of constraints. Constraints occur naturally in many data cleaning domains and can …

Constrained clustering: Current and new trends

P Gançarski, TBH Dao, B Crémilleux… - A Guided Tour of …, 2020 - Springer
Clustering is an unsupervised process which aims to discover regularities and underlying
structures in data. Constrained clustering extends clustering in such a way that expert …

Constrained distance based clustering for time-series: a comparative and experimental study

T Lampert, TBH Dao, B Lafabregue, N Serrette… - Data Mining and …, 2018 - Springer
Constrained clustering is becoming an increasingly popular approach in data mining. It
offers a balance between the complexity of producing a formal definition of thematic classes …

Constrained locally weighted clustering

H Cheng, KA Hua, K Vu - Proceedings of the VLDB Endowment, 2008 - dl.acm.org
Data clustering is a difficult problem due to the complex and heterogeneous natures of
multidimensional data. To improve clustering accuracy, we propose a scheme to capture the …

A novel semi-supervised approach for network traffic clustering

Y Wang, Y Xiang, J Zhang, S Yu - 2011 5th International …, 2011 - ieeexplore.ieee.org
Network traffic classification is an essential component for network management and
security systems. To address the limitations of traditional port-based and payload-based …

Survey on using constraints in data mining

V Grossi, A Romei, F Turini - Data mining and knowledge discovery, 2017 - Springer
This paper provides an overview of the current state-of-the-art on using constraints in
knowledge discovery and data mining. The use of constraints in a data mining task requires …

Pairwise constraint propagation with dual adversarial manifold regularization

Y Jia, H Liu, J Hou, S Kwong - IEEE transactions on neural …, 2020 - ieeexplore.ieee.org
Pairwise constraints (PCs) composed of must-links (MLs) and cannot-links (CLs) are widely
used in many semisupervised tasks. Due to the limited number of PCs, pairwise constraint …