Hierarchical unsupervised partitioning of large size data and its application to hyperspectral images

J Alameddine, K Chehdi, C Cariou - Remote Sensing, 2021 - mdpi.com
In this paper, we propose a true unsupervised method to partition large-size images, where
the number of classes, training samples, and other a priori information is not known. Thus …

MultiEM: Efficient and effective unsupervised multi-table entity matching

X Zeng, P Wang, Y Mao, L Chen… - 2024 IEEE 40th …, 2024 - ieeexplore.ieee.org
Entity Matching (EM), which aims to identify all pairs of records referring to the same real-
world entity from relational tables, is one of the most important tasks in real-world data …

Duplicate table discovery with Xash

M Koch, M Esmailoghli, S Auer, Z Abedjan - 2023 - dl.gi.de
Data lakes are typically lightly curated and as such prone to data quality problems and
inconsistencies. In particular, duplicate tables are common in most repositories. The goal of …

[PDF] Matching Entities from Multiple Sources with Hierarchical Agglomerative Clustering.

A Saeedi, L David, E Rahm - KEOD, 2021 - scitepress.org
We propose extensions to Hierarchical Agglomerative Clustering (HAC) to match and cluster
entities from multiple sources that can be either duplicate-free or dirty. The proposed …

[BOOK] Duplicate Table Detection with Xash

M Koch, M Esmailoghli, S Auer, Z Abedjan - 2023 - repo.uni-hannover.de
Data lakes are typically lightly curated and as such prone to data quality problems and
inconsistencies. In particular, duplicate tables are common in most repositories. The goal of …

Quality prediction of multi-stage batch process based on integrated ConvBiGRU with attention mechanism

K Liu, X Zhao, M Mou, Y Hui - Applied Intelligence, 2025 - Springer
It is important for quality prediction and monitoring to ensure the safe operation of the
process. When constructing a prediction model, it is crucial to choose appropriate input …

Graph-based Active Learning for Entity Cluster Repair

V Christen, D Obraczka, M Hofer, M Franke… - arXiv preprint arXiv …, 2024 - arxiv.org
Cluster repair methods aim to determine errors in clusters and modify them so that each
cluster consists of records representing the same entity. Current cluster repair …

Stop Relearning: Model Reuse via Feature Distribution Analysis for Incremental Entity Resolution

V Christen, A Sabra, E Rahm - arXiv preprint arXiv:2412.09355, 2024 - arxiv.org
Entity resolution is essential for data integration, facilitating analytics and insights from
complex systems. Multi-source and incremental entity resolution address the challenges of …

[PDF] Martin Franke

F Rohde, E Rahm - 2024 - openproceedings.org
Record linkage is the task of identifying records from different databases that refer to the
same real-world entity. This task is an essential component of data integration to facilitate …

[PDF] Leveraging Clustering Algorithms on Connected Components for Entity Resolution

F Hoogmoed - 2023 - studenttheses.uu.nl
Entity resolution is a critical task in enhancing data quality and ensuring the reliability of data
analytics, as it involves identifying distinct records in a dataset that correspond to the same …