Wander join: Online aggregation via random walks

F Li, B Wu, K Yi, Z Zhao - … of the 2016 International Conference on …, 2016 - dl.acm.org
Joins are expensive, and online aggregation over joins was proposed to mitigate the cost,
which offers users a nice and flexible tradeoff between query efficiency and accuracy in a …

Approximate query processing: No silver bullet

S Chaudhuri, B Ding, S Kandula - Proceedings of the 2017 ACM …, 2017 - dl.acm.org
In this paper, we reflect on the state of the art of Approximate Query Processing. Although
much technical progress has been made in this area of research, we are yet to see its impact …

VerdictDB: Universalizing approximate query processing

Y Park, B Mozafari, J Sorenson, J Wang - Proceedings of the 2018 …, 2018 - dl.acm.org
Despite 25 years of research in academia, approximate query processing (AQP) has had
little industrial adoption. One of the major causes of this slow adoption is the reluctance of …

Approximate query processing: What is new and where to go? A survey on approximate query processing

K Li, G Li - Data Science and Engineering, 2018 - Springer
Online analytical processing (OLAP) is a core functionality in database systems. The
performance of OLAP is crucial to make online decisions in many applications. However, it is …

Towards scalable dataframe systems

D Petersohn, S Macke, D Xin, W Ma, D Lee… - arXiv preprint arXiv …, 2020 - arxiv.org
Dataframes are a popular abstraction to represent, prepare, and analyze data. Despite the
remarkable success of dataframe libraries in R and Python, dataframes face performance …

Scaling up crowd-sourcing to very large datasets: a case for active learning

B Mozafari, P Sarkar, M Franklin, M Jordan… - Proceedings of the …, 2014 - dl.acm.org
Crowd-sourcing has become a popular means of acquiring labeled data for many tasks
where humans are more accurate than computers, such as image tagging, entity resolution …

Knowing when you're wrong: building fast and reliable approximate query processing systems

S Agarwal, H Milner, A Kleiner, A Talwalkar… - Proceedings of the …, 2014 - dl.acm.org
Modern data analytics applications typically process massive amounts of data on clusters of
tens, hundreds, or thousands of machines to support near-real-time decisions. The quantity …

Quickr: Lazily approximating complex adhoc queries in bigdata clusters

S Kandula, A Shanbhag, A Vitorovic, M Olma… - Proceedings of the …, 2016 - dl.acm.org
We present a system that approximates the answer to complex ad-hoc queries in big-data
clusters by injecting samplers on-the-fly and without requiring pre-existing samples …

Visualization-aware sampling for very large databases

Y Park, M Cafarella, B Mozafari - 2016 IEEE 32nd International …, 2016 - ieeexplore.ieee.org
Interactive visualizations are crucial in ad hoc data exploration and analysis. However, with
the growing number of massive datasets, generating visualizations in interactive timescales …

Sample + Seek: Approximating aggregates with distribution precision guarantee

B Ding, S Huang, S Chaudhuri, K Chakrabarti… - Proceedings of the …, 2016 - dl.acm.org
Data volumes are growing exponentially for our decision-support systems making it
challenging to ensure interactive response time for ad-hoc queries without increasing cost of …