Crowdsourced data management: Overview and challenges

G Li, Y Zheng, J Fan, J Wang, R Cheng - Proceedings of the 2017 ACM …, 2017 - dl.acm.org
Many important data management and analytics tasks cannot be completely addressed by
automated processes. Crowdsourcing is an effective way to harness human cognitive …

Truth inference in crowdsourcing: Is the problem solved?

Y Zheng, G Li, Y Li, C Shan, R Cheng - Proceedings of the VLDB …, 2017 - dl.acm.org
Crowdsourcing has emerged as a novel problem-solving paradigm, which facilitates
addressing problems that are hard for computers, e.g., entity resolution and sentiment …

A review and experimental analysis of active learning over crowdsourced data

B Sayin, E Krivosheev, J Yang, A Passerini… - Artificial Intelligence …, 2021 - Springer
Training data creation is increasingly a key bottleneck for developing machine learning,
especially for deep learning systems. Active learning provides a cost-effective means for …

Crowdsourced data management: A survey

G Li, J Wang, Y Zheng… - IEEE Transactions on …, 2016 - ieeexplore.ieee.org
Many important data management and analytics tasks cannot be completely addressed by
automated processes. These tasks, such as entity resolution, sentiment analysis, and image …

CrowdER: Crowdsourcing entity resolution

J Wang, T Kraska, MJ Franklin, J Feng - arxiv preprint arxiv:1208.1927, 2012 - arxiv.org
Entity resolution is central to data integration and data cleaning. Algorithmic approaches
have been improving in quality, but remain far from perfect. Crowdsourcing platforms offer a …

Towards scalable dataframe systems

D Petersohn, S Macke, D Xin, W Ma, D Lee… - arxiv preprint arxiv …, 2020 - arxiv.org
Dataframes are a popular abstraction to represent, prepare, and analyze data. Despite the
remarkable success of dataframe libraries in R and Python, dataframes face performance …

Leveraging transitive relations for crowdsourced joins

J Wang, G Li, T Kraska, MJ Franklin… - Proceedings of the 2013 …, 2013 - dl.acm.org
The development of crowdsourced query processing systems has recently attracted
significant attention in the database community. A variety of crowdsourced queries have …

Falcon: Scaling up hands-off crowdsourced entity matching to build cloud services

S Das, PS GC, AH Doan, JF Naughton… - Proceedings of the …, 2017 - dl.acm.org
Many works have applied crowdsourcing to entity matching (EM). While promising, these
approaches are limited in that they often require a developer to be in the loop. As such, it is …

Human-in-the-loop outlier detection

C Chai, L Cao, G Li, J Li, Y Luo, S Madden - Proceedings of the 2020 …, 2020 - dl.acm.org
Outlier detection is critical to a large number of applications from finance fraud detection to
health care. Although numerous approaches have been proposed to automatically detect …

Revisiting prompt engineering via declarative crowdsourcing

AG Parameswaran, S Shankar, P Asawa, N Jain… - arxiv preprint arxiv …, 2023 - arxiv.org
Large language models (LLMs) are incredibly powerful at comprehending and generating
data in the form of text, but are brittle and error-prone. There has been an advent of toolkits …