Analysis of Large-Scale Multi-Tenant GPU clusters for DNN training workloads
With widespread advances in machine learning, a number of large enterprises are
beginning to incorporate machine learning models across a number of products. These …
Live video analytics at scale with approximation and Delay-Tolerance
Video cameras are pervasively deployed for security and smart city scenarios, with millions
of them in large cities worldwide. Achieving the potential of these cameras requires …
BlinkDB: queries with bounded errors and bounded response times on very large data
In this paper, we present BlinkDB, a massively parallel, approximate query engine for
running interactive SQL queries on large volumes of data. BlinkDB allows users to trade-off …
Synopses for massive data: Samples, histograms, wavelets, sketches
Methods for Approximate Query Processing (AQP) are essential for dealing with
massive data. They are often the only means of providing interactive response times when …
Data-stream sampling: Basic techniques and results
PJ Haas - Data Stream Management: Processing High-Speed …, 2016 - Springer
Perhaps the most basic synopsis of a data stream is a sample of elements from the stream. A
key benefit of such a sample is its flexibility: the sample can serve as input to a wide variety …
MapReduce online
MapReduce is a popular framework for data-intensive distributed computing of batch jobs.
To simplify fault tolerance, many implementations of MapReduce materialize the entire …
Approximate query processing: No silver bullet
In this paper, we reflect on the state of the art of Approximate Query Processing. Although
much technical progress has been made in this area of research, we are yet to see its impact …
Wander join: Online aggregation via random walks
Joins are expensive, and online aggregation over joins was proposed to mitigate the cost,
which offers users a nice and flexible tradeoff between query efficiency and accuracy in a …
Slaq: quality-driven scheduling for distributed machine learning
Training machine learning (ML) models with large datasets can incur significant resource
contention on shared clusters. This training typically involves many iterations that continually …
Approximate query processing: What is new and where to go? A survey on approximate query processing
Online analytical processing (OLAP) is a core functionality in database systems. The
performance of OLAP is crucial to make online decisions in many applications. However, it is …