Acceleration methods

A d'Aspremont, D Scieur, A Taylor - Foundations and Trends® …, 2021 - nowpublishers.com
This monograph covers some recent advances in a range of acceleration techniques
frequently used in convex optimization. We first use quadratic optimization problems to …
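
As an illustration of the kind of acceleration the monograph analyzes on quadratics, here is a minimal NumPy sketch (not taken from the text) comparing plain gradient descent with Nesterov's accelerated gradient on a random strongly convex quadratic; the problem instance, the step size 1/L, and the constant momentum coefficient are standard textbook choices, not the monograph's.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 200
A = rng.standard_normal((d, d))
H = A.T @ A / d + 0.1 * np.eye(d)      # random strongly convex quadratic
b = rng.standard_normal(d)
x_star = np.linalg.solve(H, b)

eigs = np.linalg.eigvalsh(H)
L, mu = eigs.max(), eigs.min()         # smoothness and strong convexity constants

def grad(x):
    return H @ x - b

# Plain gradient descent with step 1/L.
x = np.zeros(d)
for _ in range(300):
    x = x - grad(x) / L

# Nesterov's accelerated gradient (constant momentum for strongly convex f).
beta = (np.sqrt(L) - np.sqrt(mu)) / (np.sqrt(L) + np.sqrt(mu))
y = z = np.zeros(d)
for _ in range(300):
    z_new = y - grad(y) / L
    y = z_new + beta * (z_new - z)
    z = z_new

print("gradient descent error :", np.linalg.norm(x - x_star))
print("accelerated error      :", np.linalg.norm(z - x_star))
```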

SGD in the large: Average-case analysis, asymptotics, and stepsize criticality

C Paquette, K Lee, F Pedregosa… - … on Learning Theory, 2021 - proceedings.mlr.press
We propose a new framework, inspired by random matrix theory, for analyzing the dynamics
of stochastic gradient descent (SGD) when both the number of samples and the dimension are …
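
A toy instance of the regime described above, not the paper's analysis framework: single-sample SGD on a random least-squares problem in which the sample count n and the dimension d are both large; the data model, noise level, and constant step size are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 4000, 1000                      # samples and dimension both large
A = rng.standard_normal((n, d)) / np.sqrt(d)
x_true = rng.standard_normal(d)
b = A @ x_true + 0.1 * rng.standard_normal(n)

x = np.zeros(d)
step = 0.5                             # constant step size (illustrative)
for _ in range(20 * n):                # single-sample SGD
    i = rng.integers(n)
    a_i = A[i]
    x -= step * (a_i @ x - b[i]) * a_i

# Constant-step SGD settles at a noise floor around the signal.
print("relative error:", np.linalg.norm(x - x_true) / np.linalg.norm(x_true))
```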

Halting time is predictable for large models: A universality property and average-case analysis

C Paquette, B van Merriënboer, E Paquette… - Foundations of …, 2023 - Springer
Average-case analysis computes the complexity of an algorithm averaged over all possible
inputs. Compared to worst-case analysis, it is more representative of the typical behavior of …
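
A hedged sketch of the idea, not the paper's universality result: run fixed-step gradient descent to a set tolerance on many random quadratic instances and inspect how concentrated the halting times are; the problem ensemble and tolerance are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)
d, trials, tol = 300, 50, 1e-6
halting_times = []

for _ in range(trials):
    A = rng.standard_normal((2 * d, d)) / np.sqrt(2 * d)
    H = A.T @ A                                    # random (Wishart) Hessian
    b = rng.standard_normal(d) / np.sqrt(d)
    L = np.linalg.eigvalsh(H).max()
    x, k = np.zeros(d), 0
    while np.linalg.norm(H @ x - b) > tol and k < 100_000:
        x -= (H @ x - b) / L                       # fixed-step gradient descent
        k += 1
    halting_times.append(k)

# Halting times concentrate sharply around their mean over random inputs.
print("mean iterations:", np.mean(halting_times))
print("relative spread:", np.std(halting_times) / np.mean(halting_times))
```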

Acceleration through spectral density estimation

F Pedregosa, D Scieur - International Conference on …, 2020 - proceedings.mlr.press
We develop a framework for the average-case analysis of random quadratic problems and
derive algorithms that are optimal under this analysis. This yields a new class of methods …
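
For intuition only, here is the basic ingredient such average-case methods start from, an empirical spectral density of the Hessian of a random quadratic; the full eigendecomposition below is for illustration, whereas practical estimators avoid it.

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 2000, 500
A = rng.standard_normal((n, d)) / np.sqrt(n)
H = A.T @ A                           # Hessian of the least-squares objective

# Empirical spectral density (here close to a Marchenko-Pastur law);
# average-case optimal methods tune step/momentum sequences to this density.
eigs = np.linalg.eigvalsh(H)
hist, edges = np.histogram(eigs, bins=20, density=True)
for lo, hi, h in zip(edges[:-1], edges[1:], hist):
    print(f"[{lo:5.2f}, {hi:5.2f})  density {h:.3f}")
```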

Universal average-case optimality of Polyak momentum

D Scieur, F Pedregosa - International Conference on …, 2020 - proceedings.mlr.press
Polyak momentum (PM), also known as the heavy-ball method, is a widely used optimization
method that enjoys an asymptotically optimal worst-case complexity on quadratic objectives …
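
For reference, a minimal sketch of the heavy-ball recursion the paper studies, applied to a random quadratic with the classical parameter choices for alpha and beta; the instance and iteration budget are illustrative, and none of this reproduces the paper's average-case argument.

```python
import numpy as np

rng = np.random.default_rng(4)
d = 200
A = rng.standard_normal((d, d))
H = A.T @ A / d + 0.05 * np.eye(d)     # random strongly convex quadratic
b = rng.standard_normal(d)

eigs = np.linalg.eigvalsh(H)
L, mu = eigs.max(), eigs.min()

# Classical Polyak (heavy-ball) parameters for quadratics.
alpha = 4.0 / (np.sqrt(L) + np.sqrt(mu)) ** 2
beta = ((np.sqrt(L) - np.sqrt(mu)) / (np.sqrt(L) + np.sqrt(mu))) ** 2

x_prev = x = np.zeros(d)
for _ in range(500):
    x_next = x - alpha * (H @ x - b) + beta * (x - x_prev)
    x_prev, x = x, x_next

print("gradient norm at the last iterate:", np.linalg.norm(H @ x - b))
```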

Debiasing distributed second order optimization with surrogate sketching and scaled regularization

M Derezinski, B Bartan, M Pilanci… - Advances in Neural …, 2020 - proceedings.neurips.cc
In distributed second order optimization, a standard strategy is to average many local
estimates, each of which is based on a small sketch or batch of the data. However, the local …
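
A small sketch of the baseline averaging strategy described above (not the paper's debiasing method): each worker solves a ridge problem on its own batch and the coordinator averages the local solutions, which over-regularizes relative to the full-data solve; the problem sizes and regularization level are assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
n, d, workers, lam = 10000, 50, 20, 100.0
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + rng.standard_normal(n)

# Each worker solves a ridge problem on its own batch; the coordinator averages.
batches = np.array_split(rng.permutation(n), workers)
local = []
for idx in batches:
    Ai, bi = A[idx], b[idx]
    local.append(np.linalg.solve(Ai.T @ Ai + lam * np.eye(d), Ai.T @ bi))
x_avg = np.mean(local, axis=0)

# The averaged estimate is biased (each batch is over-regularized relative to
# the full data), which is the effect the paper's method corrects.
x_full = np.linalg.solve(A.T @ A + lam * np.eye(d), A.T @ b)
print("averaged vs full-data ridge:",
      np.linalg.norm(x_avg - x_full) / np.linalg.norm(x_full))
```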

Effective dimension adaptive sketching methods for faster regularized least-squares optimization

J Lacotte, M Pilanci - Advances in neural information …, 2020 - proceedings.neurips.cc
We propose a new randomized algorithm for solving L2-regularized least-squares problems
based on sketching. We consider two of the most popular random embeddings, namely …
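
A minimal sketch-and-solve illustration under assumed sizes, using a Gaussian embedding, for an L2-regularized least-squares problem; the fixed sketch size m here stands in for the effective-dimension-adaptive choice the paper develops.

```python
import numpy as np

rng = np.random.default_rng(6)
n, d, lam = 10000, 200, 10.0
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

m = 800                                       # sketch size; the paper adapts
                                              # this to the effective dimension
S = rng.standard_normal((m, n)) / np.sqrt(m)  # Gaussian embedding
SA, Sb = S @ A, S @ b

# Sketch-and-solve estimate of the ridge solution versus the exact one.
x_sketch = np.linalg.solve(SA.T @ SA + lam * np.eye(d), SA.T @ Sb)
x_exact = np.linalg.solve(A.T @ A + lam * np.eye(d), A.T @ b)
print("relative error:",
      np.linalg.norm(x_sketch - x_exact) / np.linalg.norm(x_exact))
```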

Conformal frequency estimation using discrete sketched data with coverage for distinct queries

M Sesia, S Favaro, E Dobriban - Journal of Machine Learning Research, 2023 - jmlr.org
This paper develops conformal inference methods to construct a confidence interval for the
frequency of a queried object in a very large discrete data set, based on a sketch with a …
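
The conformal construction itself is not reproduced here; below is only a minimal count-min sketch of the sort such frequency queries are issued against, written as a plain illustrative class with assumed width and depth.

```python
import numpy as np

class CountMinSketch:
    """Minimal count-min sketch: hashed counters giving (upward-biased)
    frequency estimates for items in a large discrete data set."""

    def __init__(self, width=2048, depth=5, seed=0):
        self.width, self.depth = width, depth
        self.table = np.zeros((depth, width), dtype=np.int64)
        self.salts = np.random.default_rng(seed).integers(1, 2**31, size=depth)

    def _cols(self, item):
        return [hash((int(s), item)) % self.width for s in self.salts]

    def add(self, item):
        for row, col in enumerate(self._cols(item)):
            self.table[row, col] += 1

    def query(self, item):
        # Collisions only ever add, so the estimate never undercounts.
        return min(self.table[row, col] for row, col in enumerate(self._cols(item)))

cms = CountMinSketch()
stream = ["a"] * 500 + ["b"] * 40 + [f"item{i}" for i in range(10000)]
for item in stream:
    cms.add(item)
print("estimated frequency of 'a':", cms.query("a"))   # at least 500
```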

Training quantized neural networks to global optimality via semidefinite programming

B Bartan, M Pilanci - International Conference on Machine …, 2021 - proceedings.mlr.press
Neural networks (NNs) have been extremely successful across many tasks in machine
learning. Quantization of NN weights has become an important topic due to its impact on …

Faster least squares optimization

J Lacotte, M Pilanci - arXiv preprint arXiv:1911.02675, 2019 - arxiv.org
We investigate iterative methods with randomized preconditioners for solving
overdetermined least-squares problems, where the preconditioners are based on a random …
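
A hedged sketch of the sketch-and-precondition idea: QR-factor a small Gaussian sketch of A to build a preconditioner, then run conjugate gradient on the preconditioned normal equations; the Gaussian embedding, sketch size, and iteration count are illustrative assumptions rather than the paper's choices.

```python
import numpy as np

rng = np.random.default_rng(7)
n, d = 10000, 100
A = rng.standard_normal((n, d)) * rng.uniform(0.1, 10.0, size=d)  # ill-conditioned columns
b = rng.standard_normal(n)

# Randomized preconditioner: QR-factor a small Gaussian sketch of A, so that
# A @ inv(R) is close to orthonormal and the preconditioned problem is easy.
m = 6 * d
S = rng.standard_normal((m, n)) / np.sqrt(m)
_, R = np.linalg.qr(S @ A)

def apply_M(y):
    """Normal-equations operator of the preconditioned matrix A @ inv(R)."""
    z = np.linalg.solve(R, y)
    return np.linalg.solve(R.T, A.T @ (A @ z))

# Conjugate gradient on the (well-conditioned) preconditioned normal equations.
c = np.linalg.solve(R.T, A.T @ b)
y = np.zeros(d)
r = c - apply_M(y)
p = r.copy()
for _ in range(25):
    Mp = apply_M(p)
    step = (r @ r) / (p @ Mp)
    y += step * p
    r_new = r - step * Mp
    p = r_new + ((r_new @ r_new) / (r @ r)) * p
    r = r_new

x = np.linalg.solve(R, y)              # undo the change of variables
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)
print("relative error:", np.linalg.norm(x - x_exact) / np.linalg.norm(x_exact))
```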