String similarity search and join: a survey

M Yu, G Li, D Deng, J Feng - Frontiers of Computer Science, 2016 - Springer
String similarity search and join are two important operations in data cleaning and
integration, which extend traditional exact search and exact join operations in databases by …

Crowd intelligence in AI 2.0 era

W Li, W Wu, H Wang, X Cheng, H Chen, Z Zhou… - Frontiers of Information …, 2017 - Springer
The Internet based cyber-physical world has profoundly changed the information
environment for the development of artificial intelligence (AI), bringing a new wave of AI …

Challenges in data crowdsourcing

H Garcia-Molina, M Joglekar, A Marcus… - … on Knowledge and …, 2016 - ieeexplore.ieee.org
Crowdsourcing refers to solving large problems by involving human workers that solve
component sub-problems or tasks. In data crowdsourcing, the problem involves data …

Cleaning crowdsourced labels using oracles for statistical classification

M Dolatshah, M Teoh, J Wang, J Pei - Proceedings of the VLDB …, 2018 - dl.acm.org
Nowadays, crowdsourcing is being widely used to collect training data for solving
classification problems. However, crowdsourced labels are often noisy, and there is a …

Waldo: An adaptive human interface for crowd entity resolution

V Verroios, H Garcia-Molina… - Proceedings of the 2017 …, 2017 - dl.acm.org
In Entity Resolution, the objective is to find which records of a dataset refer to the same
real-world entity. Crowd Entity Resolution uses humans, in addition to machine algorithms, to …

Effective Bayesian-network-based missing value imputation enhanced by crowdsourcing

C Ye, H Wang, W Lu, J Li - Knowledge-Based Systems, 2020 - Elsevier
During the process of data collection, incompleteness is one of the most serious data quality
problems to deal with. Traditional imputation methods mostly rely on statistics and machine …

Semi-universal geo-crack detection by machine learning

Y Shi, M Ballesio, K Johansen, D Trentman… - Frontiers in Earth …, 2023 - frontiersin.org
Introduction: Cracks are a key feature that determines the structural integrity of rocks, and
their angular distribution can be used to determine the local or regional stress patterns. The …

A survey of uncertain data management

L Li, H Wang, J Li, H Gao - Frontiers of Computer Science, 2020 - Springer
Uncertain data are data with uncertainty information, which exist widely in database
applications. In recent years, uncertainty in data has brought challenges in almost all …

Well log prediction while drilling using seismic impedance with an improved LSTM artificial neural networks

H Wang, Y Xu, S Tang, L Wu, W Cao… - Frontiers in Earth …, 2023 - frontiersin.org
Well log prediction while drilling estimates the rock properties ahead of drilling bits. A
reliable well log prediction is able to assist reservoir engineers in updating the geological …

Crowdsourcing for data management

V Crescenzi, AAA Fernandes, P Merialdo… - … and Information Systems, 2017 - Springer
Crowdsourcing provides access to a pool of human workers who can contribute solutions to
tasks that are challenging for computers. Proposals have been made for the use of …