Hierarchical unsupervised partitioning of large size data and its application to hyperspectral images
In this paper, we propose a true unsupervised method to partition large-size images, where
the number of classes, training samples, and other a priori information is not known. Thus …
MultiEM: Efficient and effective unsupervised multi-table entity matching
Entity Matching (EM), which aims to identify all pairs of records referring to the same real-
world entity from relational tables, is one of the most important tasks in real-world data …
Duplicate table discovery with Xash
Data lakes are typically lightly curated and as such prone to data quality problems and
inconsistencies. In particular, duplicate tables are common in most repositories. The goal of …
Matching Entities from Multiple Sources with Hierarchical Agglomerative Clustering
We propose extensions to Hierarchical Agglomerative Clustering (HAC) to match and cluster
entities from multiple sources that can be either duplicate-free or dirty. The proposed …
Duplicate Table Detection with Xash
Data lakes are typically lightly curated and as such prone to data quality problems and
inconsistencies. In particular, duplicate tables are common in most repositories. The goal of …
Quality prediction of multi-stage batch process based on integrated ConvBiGRU with attention mechanism
K Liu, X Zhao, M Mou, Y Hui - Applied Intelligence, 2025 - Springer
It is important for quality prediction and monitoring to ensure the safe operation of the
process. When constructing a prediction model, it is crucial to choose appropriate input …
Graph-based Active Learning for Entity Cluster Repair
Cluster repair methods aim to determine errors in clusters and modify them so that each
cluster consists of records representing the same entity. Current cluster repair …
Stop Relearning: Model Reuse via Feature Distribution Analysis for Incremental Entity Resolution
Entity resolution is essential for data integration, facilitating analytics and insights from
complex systems. Multi-source and incremental entity resolution address the challenges of …
Martin Franke
F Rohde, E Rahm - 2024 - openproceedings.org
Record linkage is the task of identifying records from different databases that refer to the
same real-world entity. This task is an essential component of data integration to facilitate …
Leveraging Clustering Algorithms on Connected Components for Entity Resolution
F Hoogmoed - 2023 - studenttheses.uu.nl
Entity resolution is a critical task in enhancing data quality and ensuring the reliability of data
analytics, as it involves identifying distinct records in a dataset that correspond to the same …