FetchSGD: Communication-efficient federated learning with sketching
Existing approaches to federated learning suffer from a communication bottleneck as well as
convergence issues due to sparse client participation. In this paper we introduce a novel …
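The sketch in the title is a Count Sketch, which is linear, so a central server can aggregate per-client sketches by plain addition and recover the heavy gradient coordinates from the sum. A minimal illustration of that primitive (the dimensions, class, and variable names are ours, not the paper's code):

```python
import numpy as np

class CountSketch:
    """Linear sketch of a d-dimensional vector into an r x c table.
    Linearity, sketch(g1) + sketch(g2) == sketch(g1 + g2), is what
    lets a server sum client sketches directly."""

    def __init__(self, d, rows=5, cols=256, seed=0):
        rng = np.random.default_rng(seed)
        self.rows, self.cols = rows, cols
        self.buckets = rng.integers(0, cols, size=(rows, d))  # bucket hashes
        self.signs = rng.choice([-1.0, 1.0], size=(rows, d))  # sign hashes

    def sketch(self, g):
        table = np.zeros((self.rows, self.cols))
        for j in range(self.rows):
            np.add.at(table[j], self.buckets[j], self.signs[j] * g)
        return table

    def decode(self, table):
        # Median-of-rows estimate of each coordinate; only the heavy
        # coordinates of the underlying vector are recovered reliably.
        r = np.arange(self.rows)[:, None]
        return np.median(self.signs * table[r, self.buckets], axis=0)

# Two "clients" sketch their gradients; the server sums the sketches.
rng = np.random.default_rng(1)
cs = CountSketch(d=10_000)
g1, g2 = rng.normal(size=10_000), rng.normal(size=10_000)
merged = cs.sketch(g1) + cs.sketch(g2)
estimate = cs.decode(merged)   # approximates g1 + g2 coordinate-wise
```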
Sharper Bounds for Sensitivity Sampling
In large-scale machine learning, random sampling is a popular way to approximate datasets
by a small representative subset of examples. In particular, sensitivity sampling is an …
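The idea common to this line of work: draw each example with probability proportional to (an upper bound on) its largest possible share of the objective, then reweight by the inverse probability so estimates stay unbiased. A toy numpy version with a crude stand-in score, not the bound from the paper:

```python
import numpy as np

def importance_sample(X, scores, m, seed=0):
    """Sample m rows with probability p_i proportional to scores[i] and
    weight each sampled row by 1/(m * p_i), so weighted sums of
    per-point costs are unbiased estimates of the full cost."""
    rng = np.random.default_rng(seed)
    p = scores / scores.sum()
    idx = rng.choice(len(X), size=m, p=p)
    return X[idx], 1.0 / (m * p[idx])

X = np.random.default_rng(1).normal(size=(100_000, 5))
proxy = np.linalg.norm(X - X.mean(0), axis=1) + 1e-9  # crude sensitivity proxy
S, w = importance_sample(X, proxy, m=2_000)

# The weighted sample tracks the full cost for a candidate query point q.
q = np.ones(5)
full = np.linalg.norm(X - q, axis=1).sum()
approx = (w * np.linalg.norm(S - q, axis=1)).sum()
print(full, approx)   # close, using 2,000 of 100,000 points
```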
New frameworks for offline and streaming coreset constructions
A coreset for a set of points is a small subset of weighted points that approximately
preserves important properties of the original set. Specifically, if $P$ is a set of points, $Q$ …
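The snippet cuts off mid-definition; for orientation, the usual strong coreset guarantee (the paper's exact variant may differ) asks that the weighted subset $Q$ satisfy

$|\mathrm{cost}(Q, C) - \mathrm{cost}(P, C)| \le \varepsilon \cdot \mathrm{cost}(P, C)$ for every candidate solution $C$,

so that optimizing the objective over $Q$ yields a $(1 \pm \varepsilon)$-accurate answer for $P$.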
Tight bounds for adversarially robust streams and sliding windows via difference estimators
In the adversarially robust streaming model, a stream of elements is presented to an
algorithm and is allowed to depend on the output of the algorithm at earlier times during the …
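A self-contained toy showing why adaptivity matters: a single unprotected AMS repetition for estimating the second moment $F_2$ can be forced into a roughly $n/2$-factor overestimate by an adversary that watches its answers. This is our illustration of the model, not a construction from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 1000, 10
signs = rng.choice([-1, 1], size=n)  # the sketch's hidden randomness
freq = np.zeros(n)
S = 0                                # AMS maintains S = sum_i s(i) * f(i)

def insert(i):
    global S
    S += signs[i]
    freq[i] += 1
    return S ** 2                    # the published F2 estimate

# Adaptive attack: probe each id once; if the estimate jumped, that id
# is aligned with the current sum, so insert it m - 1 more times.
prev = 0
for i in range(n):
    out = insert(i)
    if out > prev:
        for _ in range(m - 1):
            out = insert(i)
    prev = out

true_f2 = (freq ** 2).sum()
print(f"estimate {S * S}, true F2 {true_f2:.0f}, ratio {S * S / true_f2:.1f}")
# The ratio grows linearly with n; an oblivious stream would not do this.
```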
Adversarial robustness of streaming algorithms through importance sampling
Robustness against adversarial attacks has recently been at the forefront of algorithmic
design for machine learning tasks. In the adversarial streaming model, an adversary gives …
New subset selection algorithms for low rank approximation: Offline and online
Subset selection for the rank-$k$ approximation of an $n \times d$ matrix $A$ offers improvements in the
interpretability of matrices, as well as a variety of computational savings. This problem is well …
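For intuition, a common baseline (not the algorithm of this paper) samples columns by their rank-$k$ leverage scores and measures the error of projecting $A$ onto the span of the chosen columns; a numpy sketch with illustrative names:

```python
import numpy as np

def leverage_score_columns(A, k, m, seed=0):
    """Pick m columns with probability proportional to their rank-k
    leverage scores (squared column norms of the top-k right singular
    vectors). A standard baseline, not the paper's selection rule."""
    rng = np.random.default_rng(seed)
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    lev = (Vt[:k] ** 2).sum(axis=0)
    return rng.choice(A.shape[1], size=m, replace=False, p=lev / lev.sum())

rng = np.random.default_rng(1)
A = rng.normal(size=(500, 60)) @ rng.normal(size=(60, 200))  # near low rank
cols = leverage_score_columns(A, k=20, m=40)
Q, _ = np.linalg.qr(A[:, cols])
proj_err = np.linalg.norm(A - Q @ (Q.T @ A))   # error of projecting onto span
s = np.linalg.svd(A, compute_uv=False)
best_k_err = np.sqrt((s[20:] ** 2).sum())      # optimal rank-20 error
print(proj_err, best_k_err)
```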
Online Lewis weight sampling
The seminal work of Cohen and Peng [CP15] (STOC 2015) introduced Lewis weight
sampling to the theoretical computer science community, which yields fast row sampling …
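In the offline setting, [CP15] compute $\ell_p$ Lewis weights for $p < 4$ by a simple contractive fixed-point iteration; the paper above concerns maintaining such samples online, but the offline iteration is short enough to state:

```python
import numpy as np

def lewis_weights(A, p=1.0, iters=30):
    """Cohen-Peng fixed-point iteration for the l_p Lewis weights of
    the rows of A; converges for p < 4. For p = 2 the weights are
    exactly the leverage scores."""
    n, _ = A.shape
    w = np.ones(n)
    for _ in range(iters):
        # M = A^T diag(w^{1 - 2/p}) A
        M = A.T @ (A * (w ** (1.0 - 2.0 / p))[:, None])
        # tau_i = a_i^T M^{-1} a_i, then update w_i <- tau_i^{p/2}
        tau = np.einsum('ij,ji->i', A, np.linalg.solve(M, A.T))
        w = tau ** (p / 2.0)
    return w

A = np.random.default_rng(0).normal(size=(1000, 10))
print(lewis_weights(A, p=1.0).sum())   # Lewis weights sum to roughly d
```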
Recent and upcoming developments in randomized numerical linear algebra for machine learning
Large matrices arise in many machine learning and data analysis applications, including as
representations of datasets, graphs, model weights, and first- and second-order derivatives …
Streaming Euclidean k-median and k-means with o(log n) Space
We consider the classic Euclidean k-median and k-means objective on data streams, where
the goal is to provide a (1+ε)-approximation to the optimal k-median or k-means solution …
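For reference, given a point set $P \subset \mathbb{R}^d$ and a candidate set $C$ of $k$ centers, the two objectives are

$\mathrm{cost}_1(P, C) = \sum_{p \in P} \min_{c \in C} \|p - c\|_2$ ($k$-median) and $\mathrm{cost}_2(P, C) = \sum_{p \in P} \min_{c \in C} \|p - c\|_2^2$ ($k$-means),

and a $(1+\varepsilon)$-approximation must output a clustering whose cost is at most $(1+\varepsilon)$ times the optimum.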
Near-Optimal k-Clustering in the Sliding Window Model
Clustering is an important technique for identifying structural information in large-scale data
analysis, where the underlying dataset may be too large to store. In many applications …