A farewell to the bias-variance tradeoff? An overview of the theory of overparameterized machine learning

Y Dar, V Muthukumar, RG Baraniuk - arXiv preprint arXiv:2109.02355, 2021 - arxiv.org
The rapid recent progress in machine learning (ML) has raised a number of scientific
questions that challenge the longstanding dogma of the field. One of the most important …

The power of preconditioning in overparameterized low-rank matrix sensing

X Xu, Y Shen, Y Chi, C Ma - International Conference on …, 2023 - proceedings.mlr.press
We propose $\textsf{ScaledGD}(\lambda)$, a preconditioned gradient descent
method to tackle the low-rank matrix sensing problem when the true rank is unknown, and …
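As a rough illustration of the damped preconditioning this entry describes, the sketch below runs a ScaledGD($\lambda$)-style update on the simplest fully observed factorization objective. This is not the authors' code: the paper treats general sensing operators, and the objective, step size, and damping value here are assumptions for the toy example.

```python
import numpy as np

# Illustrative sketch, not the paper's implementation: a ScaledGD(lambda)-style
# damped, right-preconditioned update on the fully observed PSD objective
#   f(X) = 0.25 * ||X X^T - M||_F^2,  X in R^{n x r},  model rank r > true rank.
rng = np.random.default_rng(0)
n, r_true, r = 20, 2, 5                      # rank over-specified: r > r_true
U = rng.standard_normal((n, r_true))
M = U @ U.T                                  # rank-r_true PSD ground truth

X = 0.1 * rng.standard_normal((n, r))        # small random initialization
eta, lam = 0.3, 0.25                         # step size and damping (assumed values)
for _ in range(1000):
    grad = (X @ X.T - M) @ X                 # gradient of f at X
    precond = np.linalg.inv(X.T @ X + lam * np.eye(r))  # damped preconditioner
    X -= eta * grad @ precond                # right-preconditioned update

rel_err = np.linalg.norm(X @ X.T - M) / np.linalg.norm(M)
```

The damping $\lambda I$ keeps the preconditioner well defined when $X^\top X$ is rank-deficient, which is exactly the over-specified regime the entry is about.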

Preconditioned gradient descent for over-parameterized nonconvex matrix factorization

J Zhang, S Fattahi, RY Zhang - Advances in Neural …, 2021 - proceedings.neurips.cc
In practical instances of nonconvex matrix factorization, the rank of the true solution
$r^{\star}$ is often unknown, so the rank $r$ of the model can be over-specified as $r >$ …

Global convergence of sub-gradient method for robust matrix recovery: Small initialization, noisy measurements, and over-parameterization

J Ma, S Fattahi - Journal of Machine Learning Research, 2023 - jmlr.org
In this work, we study the performance of the sub-gradient method (SubGM) on a natural
nonconvex and nonsmooth formulation of low-rank matrix recovery with ℓ1-loss, where the …

Algorithmic regularization in model-free overparametrized asymmetric matrix factorization

L Jiang, Y Chen, L Ding - SIAM Journal on Mathematics of Data Science, 2023 - SIAM
We study the asymmetric matrix factorization problem under a natural nonconvex
formulation with arbitrary overparametrization. The model-free setting is considered, with …

Improved global guarantees for the nonconvex Burer–Monteiro factorization via rank overparameterization

RY Zhang - Mathematical Programming, 2024 - Springer
We consider minimizing a twice-differentiable, \(L\)-smooth, and \(\mu\)-strongly convex
objective \(\phi\) over an \(n\times n\) positive semidefinite matrix \(M\succeq 0\), under the …

Tensor-on-tensor regression: Riemannian optimization, over-parameterization, statistical-computational gap and their interplay

Y Luo, AR Zhang - The Annals of Statistics, 2024 - projecteuclid.org
The Annals of Statistics 2024, Vol. 52, No. 6, pp. 2583–2612 …

Preconditioned Gradient Descent for Overparameterized Nonconvex Burer–Monteiro Factorization with Global Optimality Certification

G Zhang, S Fattahi, RY Zhang - Journal of Machine Learning Research, 2023 - jmlr.org
We consider using gradient descent to minimize the nonconvex function f(X) = ϕ(XXᵀ) over
an n×r factor matrix X, in which ϕ is an underlying smooth convex cost function defined over …

Rank overspecified robust matrix recovery: Subgradient method and exact recovery

L Ding, L Jiang, Y Chen, Q Qu, Z Zhu - arXiv preprint arXiv:2109.11154, 2021 - arxiv.org
We study the robust recovery of a low-rank matrix from sparsely and grossly corrupted
Gaussian measurements, with no prior knowledge on the intrinsic rank. We consider the …

Algorithmic regularization in tensor optimization: towards a lifted approach in matrix sensing

Z Ma, J Lavaei, S Sojoudi - Advances in Neural Information …, 2024 - proceedings.neurips.cc
Gradient descent (GD) is crucial for generalization in machine learning models, as it induces
implicit regularization, promoting compact representations. In this work, we examine the role …