Construction of knowledge graphs: Current state and challenges

M Hofer, D Obraczka, A Saeedi, H Köpcke, E Rahm - Information, 2024 - mdpi.com
With Knowledge Graphs (KGs) at the center of numerous applications such as recommender
systems and question-answering, the need for generalized pipelines to construct and …

Time series data cleaning: A survey

X Wang, C Wang - IEEE Access, 2019 - ieeexplore.ieee.org
Errors are prevalent in time series data, which is particularly common in the industrial field.
Data with errors could not be stored in the database, which results in the loss of data assets …

Construction of knowledge graphs: State and challenges

M Hofer, D Obraczka, A Saeedi, H Köpcke… - arXiv preprint arXiv …, 2023 - arxiv.org
With knowledge graphs (KGs) at the center of numerous applications such as recommender
systems and question answering, the need for generalized pipelines to construct and …

Spatial data quality in the Internet of Things: Management, exploitation, and prospects

H Li, H Lu, CS Jensen, B Tang… - ACM Computing Surveys …, 2022 - dl.acm.org
With the continued deployment of the Internet of Things (IoT), increasing volumes of devices
are being deployed that emit massive spatially referenced data. Due in part to the dynamic …

KGClean: An embedding powered knowledge graph cleaning framework

C Ge, Y Gao, H Weng, C Zhang, X Miao… - arXiv preprint arXiv …, 2020 - arxiv.org
The quality assurance of the knowledge graph is a prerequisite for various knowledge-
driven applications. We propose KGClean, a novel cleaning framework powered by …

A method of cleaning data from IoT devices in Big data systems

J Bobulski, M Kubanek - … Conference on Big Data (Big Data), 2022 - ieeexplore.ieee.org
When retrieving data from IoT devices, errors in time series data often occur due to
interference. Data with errors cannot be processed or saved in databases and warehouses …

Entity Matching with AUC-Based Fairness

S Nilforoushan, Q Wu, M Milani - 2022 IEEE International …, 2022 - ieeexplore.ieee.org
The research on fair machine learning (ML) has been growing due to the high demand for
unbiased and fair ML models for objective decision-making. Most of this research has been …

GPSClean: A framework for cleaning and repairing GPS data

C Fang, F Wang, B Yao, J Xu - ACM Transactions on Intelligent Systems …, 2022 - dl.acm.org
The rise of GPS-equipped mobile devices has led to the emergence of big trajectory data.
The collected raw data usually contain errors and anomalies information caused by device …

Evaluation of duplicate detection algorithms: From quality measures to test data generation

F Panse, F Naumann - 2021 IEEE 37th International …, 2021 - ieeexplore.ieee.org
Duplicate detection identifies multiple records in a dataset that represent the same real-
world object. Many such approaches exist, both in research and in industry. To investigate …

Detecting Stale Data in Wikipedia Infoboxes

M Barth, T Bleidt, M Büßemeyer, F Heseding… - EDBT, 2023 - openproceedings.org
Today's fast-paced society is increasingly reliant on correct and up-to-date data. Wikipedia is
the world's most popular source of knowledge, and its infoboxes contain concise semi …