Communication-efficient distributed SGD with sketching

N Ivkin, D Rothchild, E Ullah… - Advances in Neural …, 2019 - proceedings.neurips.cc
Large-scale distributed training of neural networks is often limited by network bandwidth,
wherein the communication time overwhelms the local computation time. Motivated by the …

A general-purpose counting filter: Making every bit count

P Pandey, MA Bender, R Johnson, R Patro - Proceedings of the 2017 …, 2017 - dl.acm.org
Approximate Membership Query (AMQ) data structures, such as the Bloom filter, quotient
filter, and cuckoo filter, have found numerous applications in databases, storage systems …

Coresets and sketches

JM Phillips - Handbook of discrete and computational geometry, 2017 - taylorfrancis.com
Geometric data summarization has become an essential tool in both geometric
approximation algorithms and where geometry intersects with big data problems. In linear or …

Aggregation and degradation in JetStream: Streaming analytics in the wide area

A Rabkin, M Arye, S Sen, VS Pai… - 11th USENIX Symposium …, 2014 - usenix.org
We present JetStream, a system that allows real-time analysis of large, widely-distributed
changing data sets. Traditional approaches to distributed analytics require users to specify …

Frequent directions: Simple and deterministic matrix sketching

M Ghashami, E Liberty, JM Phillips… - SIAM Journal on …, 2016 - SIAM
We describe a new algorithm called FrequentDirections for deterministic matrix sketching in
the row-update model. The algorithm is presented an arbitrary input matrix A ∈ R^{n×d} one …

Optimal quantile approximation in streams

Z Karnin, K Lang, E Liberty - 2016 IEEE 57th Annual Symposium …, 2016 - ieeexplore.ieee.org
This paper resolves one of the longest standing basic problems in the streaming
computational model. Namely, optimal construction of quantile sketches. An ε approximate …

Continual learning in practice

T Diethe, T Borchert, E Thereska, B Balle… - arXiv preprint arXiv …, 2019 - arxiv.org
This paper describes a reference architecture for self-maintaining systems that can learn
continually, as data arrives. In environments where data evolves, we need architectures that …

Composable core-sets for diversity and coverage maximization

P Indyk, S Mahabadi, M Mahdian… - Proceedings of the 33rd …, 2014 - dl.acm.org
In this paper we consider efficient construction of "composable core-sets" for basic diversity
and coverage maximization problems. A core-set for a point-set in a metric space is a subset …

DDSketch: A fast and fully-mergeable quantile sketch with relative-error guarantees

C Masson, JE Rim, HK Lee - arXiv preprint arXiv:1908.10693, 2019 - arxiv.org
Summary statistics such as the mean and variance are easily maintained for large,
distributed data streams, but order statistics (i.e., sample quantiles) can only be approximately …

Randomized composable core-sets for distributed submodular maximization

V Mirrokni, M Zadimoghaddam - … of the forty-seventh annual ACM …, 2015 - dl.acm.org
An effective technique for solving optimization problems over massive data sets is to
partition the data into smaller pieces, solve the problem on each piece and compute a …