Construction of knowledge graphs: Current state and challenges

M Hofer, D Obraczka, A Saeedi, H Köpcke, E Rahm - Information, 2024 - mdpi.com
With Knowledge Graphs (KGs) at the center of numerous applications such as recommender
systems and question-answering, the need for generalized pipelines to construct and …

Time series data cleaning: A survey

X Wang, C Wang - IEEE Access, 2019 - ieeexplore.ieee.org
Errors are prevalent in time series data, which is particularly common in the industrial field.
Data with errors could not be stored in the database, which results in the loss of data assets …

Construction of knowledge graphs: State and challenges

M Hofer, D Obraczka, A Saeedi, H Köpcke… - arXiv preprint arXiv …, 2023 - arxiv.org
With knowledge graphs (KGs) at the center of numerous applications such as recommender
systems and question answering, the need for generalized pipelines to construct and …

Spatial data quality in the Internet of Things: Management, exploitation, and prospects

H Li, H Lu, CS Jensen, B Tang… - ACM Computing Surveys …, 2022 - dl.acm.org
With the continued deployment of the Internet of Things (IoT), increasing volumes of devices
are being deployed that emit massive spatially referenced data. Due in part to the dynamic …

KGClean: An embedding powered knowledge graph cleaning framework

C Ge, Y Gao, H Weng, C Zhang, X Miao… - arXiv preprint arXiv …, 2020 - arxiv.org
The quality assurance of the knowledge graph is a prerequisite for various knowledge-
driven applications. We propose KGClean, a novel cleaning framework powered by …

An automatic near-duplicate video data cleaning method based on a consistent feature hash ring

Y Qin, O Ye, Y Fu - Electronics, 2024 - mdpi.com
In recent decades, with the ever-growing scale of video data, near-duplicate videos continue
to emerge. Data quality issues caused by near-duplicate videos are becoming more and …

Entity matching with AUC-based fairness

S Nilforoushan, Q Wu, M Milani - 2022 IEEE International …, 2022 - ieeexplore.ieee.org
The research on fair machine learning (ML) has been growing due to the high demand for
unbiased and fair ML models for objective decision-making. Most of this research has been …

A method of cleaning data from IoT devices in Big data systems

J Bobulski, M Kubanek - … Conference on Big Data (Big Data), 2022 - ieeexplore.ieee.org
When retrieving data from IoT devices, errors in time series data often occur due to
interference. Data with errors cannot be processed or saved in databases and warehouses …

Evaluation of duplicate detection algorithms: From quality measures to test data generation

F Panse, F Naumann - 2021 IEEE 37th International …, 2021 - ieeexplore.ieee.org
Duplicate detection identifies multiple records in a dataset that represent the same real-
world object. Many such approaches exist, both in research and in industry. To investigate …

GPSClean: a framework for cleaning and repairing GPS data

C Fang, F Wang, B Yao, J Xu - ACM Transactions on Intelligent Systems …, 2022 - dl.acm.org
The rise of GPS-equipped mobile devices has led to the emergence of big trajectory data.
The collected raw data usually contain errors and anomalies information caused by device …