Exact expressions for double descent and implicit regularization via surrogate random design

M Dereziński, FT Liang… - Advances in Neural …, 2020 - proceedings.neurips.cc
Double descent refers to the phase transition exhibited by the generalization error of
unregularized learning models when varying the ratio between the number of parameters …
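
A minimal simulation of the phenomenon (not the paper's exact surrogate-design expressions): sweep the number of features used by a minimum-norm least squares interpolator and watch the test risk spike near the interpolation threshold. The Gaussian data model and all sizes are illustrative assumptions.

```python
# Illustrative double descent simulation (assumed Gaussian data model and
# sizes; this is not the paper's exact surrogate-design calculation).
import numpy as np

rng = np.random.default_rng(0)
n, d_full, sigma = 100, 400, 0.5              # samples, max features, noise level
w_star = rng.standard_normal(d_full) / np.sqrt(d_full)

X = rng.standard_normal((n, d_full))
y = X @ w_star + sigma * rng.standard_normal(n)
X_test = rng.standard_normal((2000, d_full))
y_test = X_test @ w_star                      # noiseless test targets

for d in [20, 50, 80, 95, 100, 105, 120, 200, 400]:
    w_hat = np.linalg.pinv(X[:, :d]) @ y      # min-norm fit on the first d features
    risk = np.mean((X_test[:, :d] @ w_hat - y_test) ** 2)
    print(f"d={d:3d}  p/n={d / n:4.2f}  test risk={risk:8.3f}")
```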

Recent and upcoming developments in randomized numerical linear algebra for machine learning

M Dereziński, MW Mahoney - Proceedings of the 30th ACM SIGKDD …, 2024 - dl.acm.org
Large matrices arise in many machine learning and data analysis applications, including as
representations of datasets, graphs, model weights, and first and second-order derivatives …

Newton-LESS: Sparsification without trade-offs for the sketched Newton update

M Dereziński, J Lacotte, M Pilanci… - Advances in Neural …, 2021 - proceedings.neurips.cc
In second-order optimization, a potential bottleneck is computing the Hessian matrix of
the optimized function at every iteration. Randomized sketching has emerged as a powerful …
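
A minimal sketch of the sketched Newton update for least squares, using a dense Gaussian embedding as a stand-in for the sparse LESS embeddings the paper studies (its point being that sparsifying the sketch costs nothing). Problem and sketch sizes are assumptions.

```python
# Sketched Newton for least squares with a dense Gaussian embedding standing
# in for the paper's sparse LESS embeddings (assumed sizes throughout).
import numpy as np

rng = np.random.default_rng(1)
n, d, m = 5000, 50, 500                       # data size, dimension, sketch size
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

w = np.zeros(d)
for it in range(10):
    grad = A.T @ (A @ w - b)                  # exact gradient (one cheap pass)
    S = rng.standard_normal((m, n)) / np.sqrt(m)
    SA = S @ A                                # sketch the Hessian square root
    H_sketch = SA.T @ SA                      # m x d products instead of n x d
    w -= np.linalg.solve(H_sketch, grad)
    print(it, np.linalg.norm(grad))
```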

Bagging in overparameterized learning: Risk characterization and risk monotonization

P Patil, JH Du, AK Kuchibhotla - Journal of Machine Learning Research, 2023 - jmlr.org
Bagging is a commonly used ensemble technique in statistics and machine learning to
improve the performance of prediction procedures. In this paper, we study the prediction risk …
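
A minimal sketch of bagging in the overparameterized regime: average M minimum-norm least squares predictors, each fit on a random subsample. The sizes, subsample rate, and data model are illustrative assumptions, not the paper's setup.

```python
# Bagged minimum-norm least squares in an overparameterized setting (d > n);
# all sizes and the subsample size k are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(2)
n, d, M, k = 120, 200, 25, 80                 # samples, features, bags, subsample
w_star = rng.standard_normal(d) / np.sqrt(d)
X = rng.standard_normal((n, d))
y = X @ w_star + 0.5 * rng.standard_normal(n)
X_test = rng.standard_normal((1000, d))
y_test = X_test @ w_star

def minnorm_fit(Xs, ys):
    return np.linalg.pinv(Xs) @ ys            # minimum-norm interpolator

single = X_test @ minnorm_fit(X, y)
bagged = np.mean([X_test @ minnorm_fit(X[idx], y[idx])
                  for idx in (rng.choice(n, size=k, replace=False)
                              for _ in range(M))], axis=0)
print("single predictor risk:", np.mean((single - y_test) ** 2))
print("bagged predictor risk:", np.mean((bagged - y_test) ** 2))
```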

Hessian averaging in stochastic Newton methods achieves superlinear convergence

S Na, M Dereziński, MW Mahoney - Mathematical Programming, 2023 - Springer
We consider minimizing a smooth and strongly convex objective function using a stochastic
Newton method. At each iteration, the algorithm is given oracle access to a stochastic …
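
A minimal sketch of the idea, under an assumed ridge-logistic model and batch size: keep exact gradients, estimate the Hessian on a fresh subsample each iteration, and invert the uniform running average of those estimates rather than the latest one.

```python
# Stochastic Newton with uniform Hessian averaging on ridge-regularized
# logistic regression (assumed sizes; exact gradients, subsampled Hessians
# averaged over all past iterations before inversion).
import numpy as np

rng = np.random.default_rng(3)
n, d, batch, lam = 4000, 20, 200, 1e-3
A = rng.standard_normal((n, d))
y = (rng.random(n) < 1 / (1 + np.exp(-A @ rng.standard_normal(d)))).astype(float)

w = np.zeros(d)
H_avg = np.zeros((d, d))
for t in range(1, 31):
    p = 1 / (1 + np.exp(-A @ w))
    grad = A.T @ (p - y) / n + lam * w                    # exact gradient
    idx = rng.choice(n, size=batch, replace=False)        # stochastic Hessian
    D = (p[idx] * (1 - p[idx]))[:, None]
    H_t = (A[idx] * D).T @ A[idx] / batch + lam * np.eye(d)
    H_avg = ((t - 1) * H_avg + H_t) / t                   # running uniform average
    w -= np.linalg.solve(H_avg, grad)
    if t % 10 == 0:
        print(t, np.linalg.norm(grad))
```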

Asymptotics of the sketched pseudoinverse

D LeJeune, P Patil, H Javadi, RG Baraniuk… - SIAM Journal on …, 2024 - SIAM
We take a random matrix theory approach to random sketching and show an asymptotic first-
order equivalence of the regularized sketched pseudoinverse of a positive semidefinite …
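
A rough numerical illustration of the flavor of this equivalence (not the paper's precise asymptotic statement): averaged over Gaussian sketches S, the regularized sketched pseudoinverse S^T (S A S^T + lam I)^{-1} S of a PSD matrix A resembles an ordinary regularized inverse (A + mu I)^{-1} at an inflated level mu >= lam, which the paper pins down exactly; here mu is simply fitted by trace matching.

```python
# Empirical illustration only: average the regularized sketched pseudoinverse
# over Gaussian sketches and fit a single inflated level mu by trace matching.
# The paper instead characterizes mu exactly via an asymptotic fixed point.
import numpy as np

rng = np.random.default_rng(4)
n, m, lam, trials = 60, 30, 0.5, 500          # assumed sizes
G = rng.standard_normal((n, n))
A = G @ G.T / n                                # PSD test matrix

avg = np.zeros((n, n))
for _ in range(trials):
    S = rng.standard_normal((m, n)) / np.sqrt(m)
    avg += S.T @ np.linalg.solve(S @ A @ S.T + lam * np.eye(m), S)
avg /= trials

w_eig = np.linalg.eigvalsh(A)
mus = np.linspace(lam, 10.0, 4000)
traces = np.array([np.sum(1.0 / (w_eig + mu)) for mu in mus])
mu = mus[np.argmin(np.abs(traces - np.trace(avg)))]
err = np.linalg.norm(avg - np.linalg.inv(A + mu * np.eye(n)))
print(f"fitted mu = {mu:.2f} vs lam = {lam}; residual norm = {err:.3f}")
```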

Precise expressions for random projections: Low-rank approximation and randomized Newton

M Dereziński, FT Liang, Z Liao… - Advances in Neural …, 2020 - proceedings.neurips.cc
It is often desirable to reduce the dimensionality of a large dataset by projecting it onto a low-
dimensional subspace. Matrix sketching has emerged as a powerful technique for …
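
A minimal sketch of the kind of random projection the paper analyzes: the standard randomized rangefinder for low-rank approximation. Matrix sizes, noise level, and target rank are illustrative assumptions.

```python
# The standard randomized rangefinder, the kind of random projection whose
# error the paper characterizes precisely (sizes and noise are assumptions).
import numpy as np

rng = np.random.default_rng(5)
m, n, k = 500, 300, 40                        # matrix shape and sketch size
A = rng.standard_normal((m, 30)) @ rng.standard_normal((30, n))  # rank ~30 signal
A += 0.01 * rng.standard_normal((m, n))       # small dense noise

Omega = rng.standard_normal((n, k))           # random projection
Q, _ = np.linalg.qr(A @ Omega)                # orthonormal basis for the sketch
A_k = Q @ (Q.T @ A)                           # project A onto that subspace
print("relative error:", np.linalg.norm(A - A_k) / np.linalg.norm(A))
```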

Sampling from a k-DPP without looking at all items

D Calandriello, M Dereziński… - Advances in Neural …, 2020 - proceedings.neurips.cc
Determinantal point processes (DPPs) are a useful probabilistic model for selecting a small
diverse subset out of a large collection of items, with applications in summarization …
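
For contrast, here is the classical exact k-DPP sampler (full eigendecomposition plus elementary symmetric polynomials), which does look at all items; the paper's contribution is avoiding exactly this cost. The RBF kernel and all sizes are assumptions.

```python
# Baseline exact k-DPP sampler via full eigendecomposition (Kulesza-Taskar
# style); it inspects every item, which is the cost the paper avoids.
import numpy as np

rng = np.random.default_rng(6)

def sample_kdpp(L, k):
    w, V = np.linalg.eigh(L)                  # eigendecomposition of the kernel
    N = len(w)
    # Elementary symmetric polynomials: E[l, i] over the first i eigenvalues.
    E = np.zeros((k + 1, N + 1))
    E[0, :] = 1.0
    for l in range(1, k + 1):
        for i in range(1, N + 1):
            E[l, i] = E[l, i - 1] + w[i - 1] * E[l - 1, i - 1]
    # Phase 1: pick k eigenvectors with marginals given by the ESP ratios.
    sel, l = [], k
    for i in range(N, 0, -1):
        if l == 0:
            break
        if rng.random() < w[i - 1] * E[l - 1, i - 1] / E[l, i]:
            sel.append(i - 1)
            l -= 1
    V = V[:, sel]
    # Phase 2: sample items sequentially from the resulting projection DPP.
    items = []
    while V.shape[1] > 0:
        p = (V ** 2).sum(axis=1)
        i = rng.choice(N, p=p / p.sum())
        items.append(i)
        j = np.flatnonzero(np.abs(V[i]) > 1e-12)[0]
        Vj = V[:, j].copy()
        V = np.delete(V, j, axis=1)
        if V.shape[1] > 0:
            V -= np.outer(Vj, V[i] / Vj[i])   # zero out row i in remaining columns
            V, _ = np.linalg.qr(V)
    return sorted(items)

X = rng.standard_normal((200, 5))
L = np.exp(-0.5 * ((X[:, None] - X[None]) ** 2).sum(-1))  # RBF similarity kernel
print(sample_kdpp(L, 10))
```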

Adaptive Newton sketch: Linear-time optimization with quadratic convergence and effective Hessian dimensionality

J Lacotte, Y Wang, M Pilanci - International Conference on …, 2021 - proceedings.mlr.press
We propose a randomized algorithm with quadratic convergence rate for convex
optimization problems with a self-concordant, composite, strongly convex objective function …
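
A minimal sketch of a Newton sketch iteration for l2-regularized logistic regression, with a naive sketch-size doubling rule standing in for the paper's adaptive criterion based on the effective Hessian dimension. Sizes, the Gaussian embedding, and the doubling rule are assumptions.

```python
# Newton sketch for l2-regularized logistic regression with a naive doubling
# rule for the sketch size, standing in for the paper's adaptive criterion.
import numpy as np

rng = np.random.default_rng(7)
n, d, lam = 8000, 40, 1e-3
A = rng.standard_normal((n, d))
y = (rng.random(n) < 1 / (1 + np.exp(-A @ rng.standard_normal(d)))).astype(float)

w, m = np.zeros(d), 4 * d                     # start with a small sketch
for t in range(12):
    p = 1 / (1 + np.exp(-A @ w))
    grad = A.T @ (p - y) / n + lam * w
    root = np.sqrt(p * (1 - p) / n)[:, None] * A       # Hessian square root
    S = rng.standard_normal((m, n)) / np.sqrt(m)
    SR = S @ root                              # m x d sketched square root
    H = SR.T @ SR + lam * np.eye(d)
    w -= np.linalg.solve(H, grad)
    m = min(2 * m, n)                          # naive stand-in for adaptivity
    print(t, m, np.linalg.norm(grad))
```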

Effective dimension adaptive sketching methods for faster regularized least-squares optimization

J Lacotte, M Pilanci - Advances in Neural Information …, 2020 - proceedings.neurips.cc
We propose a new randomized algorithm for solving L2-regularized least-squares problems
based on sketching. We consider two of the most popular random embeddings, namely …
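
A minimal sketch of the role played by the effective dimension d_lam = tr(A^T A (A^T A + lam I)^{-1}): it sets the sketch size, which can be far below the ambient dimension. The one-shot sketched solve below is a simplified stand-in for the paper's iterative methods; the decaying spectrum, sizes, and Gaussian embedding are assumptions.

```python
# The effective dimension d_lam sets the sketch size; a one-shot sketched
# ridge solve stands in for the paper's iterative methods (assumed setup).
import numpy as np

rng = np.random.default_rng(8)
n, d, lam = 6000, 300, 100.0
A = rng.standard_normal((n, d)) * np.exp(-np.arange(d) / 30.0)  # decaying spectrum
b = rng.standard_normal(n)

sv = np.linalg.svd(A, compute_uv=False)
d_lam = np.sum(sv**2 / (sv**2 + lam))          # effective dimension
m = int(4 * d_lam)                             # sketch size tied to d_lam, not d
print(f"effective dimension {d_lam:.1f} (ambient {d}); sketch size {m}")

S = rng.standard_normal((m, n)) / np.sqrt(m)
SA = S @ A
x_sk = np.linalg.solve(SA.T @ SA + lam * np.eye(d), A.T @ b)   # sketched Hessian
x_ex = np.linalg.solve(A.T @ A + lam * np.eye(d), A.T @ b)     # exact solve
print("relative error:", np.linalg.norm(x_sk - x_ex) / np.linalg.norm(x_ex))
```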