Data management for machine learning: A survey
Machine learning (ML) has widespread applications and has revolutionized many
industries, but suffers from several challenges. First, sufficient high-quality training data is …
Selective data acquisition in the wild for model charging
The lack of sufficient labeled data is a key bottleneck for practitioners in many real-world
supervised machine learning (ML) tasks. In this paper, we study a new problem, namely …
Human-in-the-loop outlier detection
Outlier detection is critical to a large number of applications from finance fraud detection to
health care. Although numerous approaches have been proposed to automatically detect …
Trustworthy AI-based Performance Diagnosis Systems for Cloud Applications: A Review
Performance diagnosis systems are defined as detecting abnormal performance
phenomena and play a crucial role in cloud applications. An effective performance …
Contact tracing incentive for COVID-19 and other pandemic diseases from a crowdsourcing perspective
Governments of the world have invested a lot of manpower and material resources to
combat COVID-19 this year. At this moment, the most efficient way that could stop the …
Interactive cleaning for progressive visualization through composite questions
In this paper, we study the problem of interactive cleaning for progressive visualization
(ICPV): Given a bad visualization V, it is to obtain a "cleaned" visualization V whose distance …
Interactively discovering and ranking desired tuples by data exploration
Data exploration—the problem of extracting knowledge from database even if we do not
know exactly what we are looking for—is important for data discovery and analysis …
Automatic data acquisition for deep learning
Deep learning (DL) has widespread applications and has revolutionized many industries.
Although automated machine learning (AutoML) can help us away from coding for DL …
Hint: harnessing the wisdom of crowds for handling multi-phase tasks
The resourcefulness of crowdsourcing can be used to handle a wide range of complex
macro-tasks, such as travel planning, translation, and software development. Multi-phase …
Combining ad hoc text mining and descriptive analytics to investigate public EV charging prices in the United States
Electric vehicle (EV) charging infrastructure is present all over the United States, but
charging prices vary greatly, both in amount and in the methods by which they are assessed …