Wander join: Online aggregation via random walks
Joins are expensive, and online aggregation over joins was proposed to mitigate the cost,
which offers users a nice and flexible tradeoff between query efficiency and accuracy in a …
Approximate query processing: No silver bullet
In this paper, we reflect on the state of the art of Approximate Query Processing. Although
much technical progress has been made in this area of research, we are yet to see its impact …
VerdictDB: Universalizing approximate query processing
Despite 25 years of research in academia, approximate query processing (AQP) has had
little industrial adoption. One of the major causes of this slow adoption is the reluctance of …
Approximate query processing: What is new and where to go? A survey on approximate query processing
Online analytical processing (OLAP) is a core functionality in database systems. The
performance of OLAP is crucial to make online decisions in many applications. However, it is …
Towards scalable dataframe systems
Dataframes are a popular abstraction to represent, prepare, and analyze data. Despite the
remarkable success of dataframe libraries in R and Python, dataframes face performance …
Scaling up crowd-sourcing to very large datasets: a case for active learning
Crowd-sourcing has become a popular means of acquiring labeled data for many tasks
where humans are more accurate than computers, such as image tagging, entity resolution …
Knowing when you're wrong: building fast and reliable approximate query processing systems
Modern data analytics applications typically process massive amounts of data on clusters of
tens, hundreds, or thousands of machines to support near-real-time decisions. The quantity …
Quickr: Lazily approximating complex adhoc queries in bigdata clusters
We present a system that approximates the answer to complex ad-hoc queries in big-data
clusters by injecting samplers on-the-fly and without requiring pre-existing samples …
Visualization-aware sampling for very large databases
Interactive visualizations are crucial in ad hoc data exploration and analysis. However, with
the growing number of massive datasets, generating visualizations in interactive timescales …
Sample + Seek: Approximating aggregates with distribution precision guarantee
Data volumes are growing exponentially for our decision-support systems making it
challenging to ensure interactive response time for ad-hoc queries without increasing cost of …