Making AI forget you: Data deletion in machine learning
Intense recent discussions have focused on how to provide individuals with control over
when their data can and cannot be used---the EU's Right To Be Forgotten regulation is an …
Federated optimization: Distributed machine learning for on-device intelligence
We introduce a new and increasingly relevant setting for distributed optimization in machine
learning, where the data defining the optimization are unevenly distributed over an …
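The setting described above (unevenly distributed data held on many devices) can be illustrated with a minimal local-update-and-average loop. This is a generic sketch of the federated setting, not the specific algorithms studied in the paper; the least-squares model, step sizes, and client sizes below are illustrative assumptions.

```python
import numpy as np

def federated_round(w, client_data, local_steps=10, lr=0.1):
    """One communication round: each client runs local SGD on its own data
    starting from the shared model w, and the server averages the results,
    weighting by local dataset size (a generic sketch, not the paper's method)."""
    updates, sizes = [], []
    for X, y in client_data:                # per-client datasets are unevenly sized
        w_local = w.copy()
        for _ in range(local_steps):
            i = np.random.randint(len(y))   # one local SGD step (least squares)
            grad = (X[i] @ w_local - y[i]) * X[i]
            w_local -= lr * grad
        updates.append(w_local)
        sizes.append(len(y))
    return np.average(updates, axis=0, weights=sizes)

# Toy run: three clients holding very different amounts of data
rng = np.random.default_rng(0)
w_true = rng.normal(size=5)
clients = []
for n in (5, 50, 500):                      # uneven distribution across devices
    X = rng.normal(size=(n, 5))
    clients.append((X, X @ w_true))
w = np.zeros(5)
for _ in range(20):
    w = federated_round(w, clients)
print(np.linalg.norm(w - w_true))           # should shrink toward zero
```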
Acceleration methods
This monograph covers some recent advances in a range of acceleration techniques
frequently used in convex optimization. We first use quadratic optimization problems to …
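One classical acceleration technique usually introduced through quadratic problems is Polyak's heavy-ball momentum. A minimal sketch, assuming a strongly convex quadratic and using the standard heavy-ball step size and momentum for that case (the monograph's own treatment and constants may differ):

```python
import numpy as np

def heavy_ball(H, g, x0, iters=300):
    """Polyak's heavy-ball method for the quadratic 0.5 * x^T H x + g^T x,
    using the standard step size and momentum for eigenvalues in [mu, L]."""
    eigs = np.linalg.eigvalsh(H)
    mu, L = eigs.min(), eigs.max()
    step = 4.0 / (np.sqrt(L) + np.sqrt(mu)) ** 2
    beta = ((np.sqrt(L) - np.sqrt(mu)) / (np.sqrt(L) + np.sqrt(mu))) ** 2
    x, x_prev = x0.copy(), x0.copy()
    for _ in range(iters):
        # gradient step plus momentum in the direction of the previous move
        x, x_prev = x - step * (H @ x + g) + beta * (x - x_prev), x
    return x

rng = np.random.default_rng(0)
M = rng.normal(size=(20, 20))
H = M @ M.T + 0.1 * np.eye(20)              # ill-conditioned positive definite
g = rng.normal(size=20)
x_star = np.linalg.solve(H, -g)             # exact minimizer
print(np.linalg.norm(heavy_ball(H, g, np.ones(20)) - x_star))  # should be small
```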
Coordinate descent algorithms
SJ Wright - Mathematical programming, 2015 - Springer
Coordinate descent algorithms solve optimization problems by successively performing
approximate minimization along coordinate directions or coordinate hyperplanes. They have …
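As the snippet says, these methods successively minimize along coordinate directions. A minimal sketch for least squares, where the one-dimensional minimization along each coordinate has a closed form (cyclic order here; randomized and block variants follow the same pattern):

```python
import numpy as np

def coordinate_descent_lstsq(A, b, num_epochs=50):
    """Cyclic coordinate descent for min_x 0.5 * ||A x - b||^2.
    Each step exactly minimizes the objective along one coordinate."""
    m, n = A.shape
    x = np.zeros(n)
    r = A @ x - b                      # maintained residual
    col_sq = (A ** 2).sum(axis=0)      # ||A[:, j]||^2 for each coordinate
    for _ in range(num_epochs):
        for j in range(n):
            delta = -(A[:, j] @ r) / col_sq[j]   # exact 1-D minimization along j
            x[j] += delta
            r += delta * A[:, j]       # keep residual consistent with x
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 10))
b = rng.normal(size=100)
x_cd = coordinate_descent_lstsq(A, b)
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.linalg.norm(x_cd - x_ls))     # should be small
```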
A proximal stochastic gradient method with progressive variance reduction
We consider the problem of minimizing the sum of two convex functions: one is the average
of a large number of smooth component functions, and the other is a general convex …
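For the composite objective described above (an average of smooth components plus a general convex term handled through its proximal operator), a variance-reduced proximal stochastic gradient step can be sketched as follows. This assumes an l1-regularized least-squares instance; the epoch length and step size are illustrative choices, not the paper's recommended settings.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (the 'general convex' term here)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_svrg(A, b, lam=0.1, epochs=30):
    """Sketch of a proximal stochastic gradient method with variance reduction
    for min_x (1/m) sum_i 0.5*(a_i^T x - b_i)^2 + lam*||x||_1. Each epoch
    recomputes a full gradient at a snapshot point, then takes corrected
    stochastic proximal steps."""
    m, n = A.shape
    step = 1.0 / (3.0 * np.max((A ** 2).sum(axis=1)))   # heuristic step size
    x = np.zeros(n)
    for _ in range(epochs):
        x_snap = x.copy()
        full_grad = A.T @ (A @ x_snap - b) / m           # gradient at snapshot
        for _ in range(m):
            i = np.random.randint(m)
            gi = (A[i] @ x - b[i]) * A[i]                # stochastic gradient at x
            gi_snap = (A[i] @ x_snap - b[i]) * A[i]      # same component at snapshot
            v = gi - gi_snap + full_grad                 # variance-reduced estimate
            x = soft_threshold(x - step * v, step * lam) # proximal step for l1 term
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(200, 20))
x_true = np.zeros(20); x_true[:3] = [1.0, -2.0, 0.5]
b = A @ x_true + 0.01 * rng.normal(size=200)
print(prox_svrg(A, b)[:5])   # roughly the planted [1, -2, 0.5, 0, 0], shrunk by the l1 penalty
```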
Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm
We improve a recent guarantee of Bach and Moulines on the linear convergence of SGD for
smooth and strongly convex objectives, reducing a quadratic dependence on the strong …
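The randomized Kaczmarz method referenced above is, for a consistent linear system, SGD with a particular weighted sampling: each row is drawn with probability proportional to its squared norm and the iterate is projected onto that equation's hyperplane. A minimal sketch, assuming a consistent system A x = b:

```python
import numpy as np

def randomized_kaczmarz(A, b, iters=5000, seed=0):
    """Randomized Kaczmarz for a consistent system A x = b. Row i is sampled
    with probability ||a_i||^2 / ||A||_F^2, and the iterate is projected onto
    the hyperplane {x : a_i^T x = b_i}."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    row_sq = (A ** 2).sum(axis=1)
    probs = row_sq / row_sq.sum()       # weighted (importance) sampling
    x = np.zeros(n)
    for _ in range(iters):
        i = rng.choice(m, p=probs)
        x += (b[i] - A[i] @ x) / row_sq[i] * A[i]   # project onto the i-th equation
    return x

rng = np.random.default_rng(1)
A = rng.normal(size=(300, 30))
x_true = rng.normal(size=30)
b = A @ x_true                          # consistent system
print(np.linalg.norm(randomized_kaczmarz(A, b) - x_true))  # should be small
```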
Accelerated gradient descent escapes saddle points faster than gradient descent
Nesterov's accelerated gradient descent (AGD), an instance of the general family of
“momentum methods,” provably achieves faster convergence rate than gradient descent …
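A minimal sketch of the Nesterov-style momentum iteration the abstract refers to, shown on an ill-conditioned convex quadratic where acceleration is easy to see; the paper itself analyzes a perturbed variant in the nonconvex setting, which this sketch does not reproduce, and the momentum constant below is an illustrative choice.

```python
import numpy as np

def agd(grad, x0, step, momentum=0.9, iters=200):
    """Nesterov-style accelerated gradient descent: take a gradient step at an
    extrapolated ('look-ahead') point, then update the momentum sequence."""
    x, x_prev = x0.copy(), x0.copy()
    for _ in range(iters):
        y = x + momentum * (x - x_prev)   # extrapolation / momentum step
        x_prev = x
        x = y - step * grad(y)            # gradient step at the look-ahead point
    return x

def gd(grad, x0, step, iters=200):
    x = x0.copy()
    for _ in range(iters):
        x -= step * grad(x)
    return x

# Ill-conditioned quadratic 0.5 * x^T diag(d) x, where acceleration helps
d = np.array([1.0, 0.01])
grad = lambda x: d * x
x0 = np.ones(2)
print("GD :", np.linalg.norm(gd(grad, x0, step=1.0 / d.max())))
print("AGD:", np.linalg.norm(agd(grad, x0, step=1.0 / d.max())))   # much smaller error
```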
Bipartite matching in nearly-linear time on moderately dense graphs
We present an Õ(m + n^1.5)-time randomized algorithm for maximum cardinality bipartite
matching and related problems (e.g., transshipment, negative-weight shortest paths, and …
Stochastic optimization with importance sampling for regularized loss minimization
Uniform sampling of training data has been commonly used in traditional stochastic
optimization algorithms such as Proximal Stochastic Mirror Descent (prox-SMD) and …
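The non-uniform alternative referenced above samples each component with probability proportional to a smoothness-related quantity and rescales the sampled gradient so it stays unbiased, which can reduce its variance. A minimal sketch for l2-regularized least squares with a proximal step for the regularizer; the sampling weights, step-size schedule, and problem instance below are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def importance_sampled_prox_sgd(A, b, lam=0.1, iters=5000, seed=0):
    """Proximal SGD for min_x (1/m) sum_i 0.5*(a_i^T x - b_i)^2 + (lam/2)*||x||^2,
    sampling example i with probability p_i proportional to ||a_i||^2 and scaling
    its gradient by 1/(m * p_i) so the estimate stays unbiased."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    row_sq = (A ** 2).sum(axis=1)
    p = row_sq / row_sq.sum()                 # importance sampling distribution
    # with these weights the reweighted per-example curvature is equalized,
    # so a single safe step size works for every sampled component
    step0 = 1.0 / (2.0 * row_sq.mean())
    x = np.zeros(n)
    for t in range(1, iters + 1):
        i = rng.choice(m, p=p)
        g = (A[i] @ x - b[i]) * A[i] / (m * p[i])   # reweighted, unbiased gradient
        step = step0 / (1.0 + lam * step0 * t)      # slowly decaying step size
        x = (x - step * g) / (1.0 + step * lam)     # proximal step for (lam/2)||x||^2
    return x

rng = np.random.default_rng(1)
A = rng.normal(size=(200, 10))
b = A @ rng.normal(size=10) + 0.1 * rng.normal(size=200)
x_hat = importance_sampled_prox_sgd(A, b)
x_ref = np.linalg.solve(A.T @ A / len(b) + 0.1 * np.eye(10), A.T @ b / len(b))
print(np.linalg.norm(x_hat - x_ref))   # close to the ridge solution, up to stochastic noise
```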
Randomized iterative methods for linear systems
We develop a novel, fundamental, and surprisingly simple randomized iterative method for
solving consistent linear systems. Our method has six different but equivalent interpretations …
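One simple member of this family projects the current iterate onto the solution set of a randomly sketched subsystem at each step. A minimal sketch, assuming a consistent system, the Euclidean norm, and a random block of rows as the sketch; the paper's general formulation allows other sketches and norms.

```python
import numpy as np

def sketch_and_project(A, b, block_size=5, iters=2000, seed=0):
    """A simple randomized iterative method for a consistent system A x = b:
    at each step, sketch the system with a random block of rows and project
    the current iterate onto the solution set of the sketched system."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    x = np.zeros(n)
    for _ in range(iters):
        rows = rng.choice(m, size=block_size, replace=False)  # random row sketch
        As, bs = A[rows], b[rows]
        # project x onto {z : As z = bs}; pinv handles possible rank deficiency
        x -= np.linalg.pinv(As) @ (As @ x - bs)
    return x

rng = np.random.default_rng(1)
A = rng.normal(size=(200, 40))
x_true = rng.normal(size=40)
b = A @ x_true                     # consistent system
print(np.linalg.norm(sketch_and_project(A, b) - x_true))   # should be small
```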