Machine Learning for Refining Knowledge Graphs: A Survey
Knowledge graph (KG) refinement refers to the process of filling in missing information,
removing redundancies, and resolving inconsistencies in KGs. With the growing popularity …
(Almost) all of entity resolution
Whether the goal is to estimate the number of people that live in a congressional district, to
estimate the number of individuals that have died in an armed conflict, or to disambiguate …
Wombat – A Generalization Approach for Automatic Link Discovery
A significant portion of the evolution of Linked Data datasets lies in updating the links to
other datasets. An important challenge when aiming to update these links automatically …
A novel ensemble learning approach to unsupervised record linkage
Record linkage is a process of identifying records that refer to the same real-world entity.
Many existing approaches to record linkage apply supervised machine learning techniques …
How do Categorical Duplicates Affect ML? A New Benchmark and Empirical Analyses
The tedious grunt work involved in data preparation (prep) before ML reduces ML user
productivity. It is also a roadblock to industrial-scale cloud AutoML workflows that build ML …
Towards scalable data discovery
We study the problem of discovering joinable datasets at scale. We approach the problem
from a learning perspective relying on profiles. These are succinct representations that …
An unsupervised instance matcher for schema-free RDF data
This article presents an unsupervised system that performs instance matching between
entities in schema-free Resource Description Framework (RDF) files. Rather than relying on …
Named entity resolution in personal knowledge graphs
M Kejriwal - arXiv preprint arXiv:2307.12173, 2023 - arxiv.org
Entity Resolution (ER) is the problem of determining when two entities refer to the same
underlying entity. The problem has been studied for over 50 years, and most recently, has …
Linking and disambiguating entities across heterogeneous RDF graphs
Establishing identity links across RDF datasets is a central and challenging task on the way
to realising the Data Web project. It is well-known that data supplied by different sources can …
Improved similarity assessment and spectral clustering for unsupervised linking of data extracted from bridge inspection reports
K Liu, N El-Gohary - Advanced Engineering Informatics, 2022 - Elsevier
Textual bridge inspection reports are important data sources for supporting data-driven
bridge deterioration prediction and maintenance decision making. Information extraction …