A farewell to the bias-variance tradeoff? An overview of the theory of overparameterized machine learning
The rapid recent progress in machine learning (ML) has raised a number of scientific
questions that challenge the longstanding dogma of the field. One of the most important …
The power of preconditioning in overparameterized low-rank matrix sensing
We propose $\textsf{ScaledGD}(\lambda)$, a preconditioned gradient descent
method to tackle the low-rank matrix sensing problem when the true rank is unknown, and …
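For concreteness, a minimal sketch of the kind of damped, preconditioned update this entry describes, on a synthetic symmetric matrix sensing instance with an over-specified rank. The problem sizes, step size eta, and damping lam are illustrative assumptions, not the paper's tuned choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r_true, r, m = 30, 2, 5, 600          # over-specified rank r > r_true

# Rank-2 ground truth and symmetrized Gaussian sensing matrices A_k
U = rng.standard_normal((n, r_true)) / np.sqrt(n)
M_star = U @ U.T
A = rng.standard_normal((m, n, n))
A = (A + A.transpose(0, 2, 1)) / 2
y = np.einsum('kij,ij->k', A, M_star)    # y_k = <A_k, M*>

X = 0.1 * rng.standard_normal((n, r))
eta, lam = 0.1, 1e-2                     # illustrative step size and damping

for _ in range(400):
    R = np.einsum('kij,ij->k', A, X @ X.T) - y
    G = (4 / m) * np.einsum('k,kij->ij', R, A) @ X   # gradient of the squared loss
    # Right-precondition by (X^T X + lam I)^{-1}; the damping lam keeps the
    # inverse well defined when the over-specified X is (nearly) rank deficient.
    X -= eta * G @ np.linalg.inv(X.T @ X + lam * np.eye(r))

print(np.linalg.norm(X @ X.T - M_star) / np.linalg.norm(M_star))
```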
Preconditioned gradient descent for over-parameterized nonconvex matrix factorization
In practical instances of nonconvex matrix factorization, the rank of the true solution
$r^{\star}$ is often unknown, so the rank $r$ of the model can be over-specified as $r > …$
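Both this entry and the previous one revolve around the same damped, right-preconditioned iteration; a hedged generic form is below. The damping schedule $\lambda_t$ is where such papers differ, and the behavior hinted at in the comments is an assumption, not any one paper's exact rule.

```latex
% Generic damped right-preconditioned update, over-specified rank r > r*:
X_{t+1} = X_t - \eta \, \nabla f(X_t) \left( X_t^{\top} X_t + \lambda_t I_r \right)^{-1}.
% The damping lambda_t keeps the inverse well defined when X_t is nearly
% rank deficient; driving lambda_t -> 0 as the iterates approach the
% solution restores the full preconditioning effect.
```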
Global convergence of sub-gradient method for robust matrix recovery: Small initialization, noisy measurements, and over-parameterization
In this work, we study the performance of the sub-gradient method (SubGM) on a natural
nonconvex and nonsmooth formulation of low-rank matrix recovery with ℓ1-loss, where the …
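A compact sketch of a subgradient iteration on this ℓ1 formulation, with small initialization and a geometrically decaying step size; the corruption level, decay rate, and all problem sizes are illustrative, untuned assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, r_true, r, m = 30, 2, 5, 800             # over-specified rank r > r_true
U = rng.standard_normal((n, r_true)) / np.sqrt(n)
M_star = U @ U.T
A = rng.standard_normal((m, n, n))
A = (A + A.transpose(0, 2, 1)) / 2          # symmetrized Gaussian sensing matrices
y = np.einsum('kij,ij->k', A, M_star)
mask = rng.random(m) < 0.2                  # 20% grossly corrupted measurements
y[mask] += 10 * rng.standard_normal(mask.sum())

X = 0.01 * rng.standard_normal((n, r))      # small initialization
step, q = 0.1, 0.998                        # geometric step-size decay (untuned)

for t in range(2000):
    R = np.einsum('kij,ij->k', A, X @ X.T) - y
    # Subgradient of the l1 loss: signs of residuals replace the residuals.
    G = (2 / m) * np.einsum('k,kij->ij', np.sign(R), A) @ X
    X -= step * (q ** t) * G

print(np.linalg.norm(X @ X.T - M_star) / np.linalg.norm(M_star))
```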
Algorithmic regularization in model-free overparametrized asymmetric matrix factorization
We study the asymmetric matrix factorization problem under a natural nonconvex
formulation with arbitrary overparametrization. The model-free setting is considered, with …
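A minimal sketch of the model-free setup this entry describes: plain gradient descent on the unregularized asymmetric objective $\|LR^{\top} - M\|_F^2$, where a small random initialization supplies the only (implicit) regularization. The synthetic low-rank target and step size are assumptions for illustration; "model-free" is reflected in the code making no rank assumption about $M$.

```python
import numpy as np

rng = np.random.default_rng(2)
n1, n2, r_true, r = 40, 30, 3, 10        # arbitrary overparametrization r > r_true
M = rng.standard_normal((n1, r_true)) @ rng.standard_normal((r_true, n2))
M /= np.linalg.norm(M, 2)                # unit spectral norm, so eta below is tame

# Small random initialization; no explicit regularizer or balancing term.
alpha, eta = 1e-3, 0.3
L = alpha * rng.standard_normal((n1, r))
R = alpha * rng.standard_normal((n2, r))

for _ in range(1000):
    E = L @ R.T - M                      # residual of the factorization
    L, R = L - eta * E @ R, R - eta * E.T @ L

print(np.linalg.norm(L @ R.T - M) / np.linalg.norm(M))
```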
Improved global guarantees for the nonconvex Burer–Monteiro factorization via rank overparameterization
RY Zhang - Mathematical Programming, 2024 - Springer
We consider minimizing a twice-differentiable, $L$-smooth, and $\mu$-strongly convex
objective $\phi$ over an $n\times n$ positive semidefinite matrix $M\succeq 0$, under the …
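For readers new to the Burer–Monteiro idea referenced in the title, the substitution replaces the PSD-constrained convex problem with an unconstrained factorized one; this restatement follows directly from the abstract, while the paper's precise overparameterization threshold is omitted here.

```latex
% Convex but expensive: optimize over an n x n PSD matrix directly.
\min_{M \succeq 0} \; \phi(M), \qquad \operatorname{rank}(M^{\star}) = r^{\star} \ll n.
% Burer--Monteiro substitution M = X X^T, with search rank r >= r*:
% nonconvex, but unconstrained and in far fewer variables.
\min_{X \in \mathbb{R}^{n \times r}} \; \phi\!\left( X X^{\top} \right).
```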
Tensor-on-tensor regression: Riemannian optimization, over-parameterization, statistical-computational gap and their interplay
The Annals of Statistics, 2024, Vol. 52, No. 6, pp. 2583–2612 …
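The paper's actual method is a Riemannian algorithm on a manifold of low-Tucker-rank tensors; as a rough stand-in, the sketch below takes a Euclidean gradient step on a scalar-response tensor regression and retracts by HOSVD truncation to an over-specified Tucker rank. Everything here (dimensions, ranks, step size, and the truncation-as-retraction shortcut) is an illustrative assumption, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(3)
dims, ranks_true, ranks = (8, 8, 8), (2, 2, 2), (4, 4, 4)  # over-specified Tucker rank
m = 2000

def unfold(T, mode):
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_mult(T, U, mode):
    """Multiply tensor T by matrix U along the given mode."""
    return np.moveaxis(np.tensordot(U, np.moveaxis(T, mode, 0), axes=1), 0, mode)

def hosvd_truncate(T, ranks):
    """Project T (approximately) onto tensors of Tucker rank <= ranks."""
    for k, r in enumerate(ranks):
        Uk = np.linalg.svd(unfold(T, k), full_matrices=False)[0][:, :r]
        T = mode_mult(T, Uk @ Uk.T, k)
    return T

# Low-Tucker-rank ground truth coefficient tensor B*
B_star = rng.standard_normal(ranks_true)
for k in range(3):
    B_star = mode_mult(B_star, rng.standard_normal((dims[k], ranks_true[k])), k)
B_star /= np.linalg.norm(B_star)

Xs = rng.standard_normal((m, *dims))                 # Gaussian covariate tensors
y = np.einsum('kabc,abc->k', Xs, B_star)             # scalar responses <X_k, B*>

B, eta = np.zeros(dims), 0.3
for _ in range(100):
    R = np.einsum('kabc,abc->k', Xs, B) - y
    G = (2 / m) * np.einsum('k,kabc->abc', R, Xs)    # least-squares gradient
    B = hosvd_truncate(B - eta * G, ranks)           # retraction substitute

print(np.linalg.norm(B - B_star))                    # recovery error (||B*|| = 1)
```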
Preconditioned Gradient Descent for Overparameterized Nonconvex Burer–Monteiro Factorization with Global Optimality Certification
We consider using gradient descent to minimize the nonconvex function $f(X) = \phi(XX^{\top})$ over
an $n\times r$ factor matrix $X$, in which $\phi$ is an underlying smooth convex cost function defined over …
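The "global optimality certification" in this title has a classical core worth spelling out: for convex $\phi$, a point $X$ globally minimizes $\phi(XX^{\top})$ over the PSD cone exactly when $S = \nabla\phi(XX^{\top})$ is positive semidefinite and $SX = 0$. A hedged numerical check of that certificate, on an assumed quadratic $\phi$ chosen for illustration:

```python
import numpy as np

def certify_global(X, grad_phi, tol=1e-8):
    """For convex phi, X globally minimizes phi(X X^T) over the PSD cone
    iff S := grad_phi(X X^T) satisfies S >= 0 (PSD) and S X = 0."""
    S = grad_phi(X @ X.T)
    psd_ok = np.linalg.eigvalsh((S + S.T) / 2).min() >= -tol
    comp_ok = np.linalg.norm(S @ X) <= tol * max(1.0, np.linalg.norm(X))
    return psd_ok and comp_ok

# Illustrative phi(M) = 0.5 * ||M - M*||_F^2, whose gradient is M - M*.
rng = np.random.default_rng(4)
n, r_true, r = 20, 2, 4
U = rng.standard_normal((n, r_true))
M_star = U @ U.T
grad_phi = lambda M: M - M_star

# At an exact (zero-padded, over-specified) factorization of M*, it passes ...
X_good = np.hstack([U, np.zeros((n, r - r_true))])
print(certify_global(X_good, grad_phi))                       # expected: True

# ... and at a random non-optimal point it fails.
print(certify_global(rng.standard_normal((n, r)), grad_phi))  # expected: False
```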
Rank overspecified robust matrix recovery: Subgradient method and exact recovery
We study the robust recovery of a low-rank matrix from sparsely and grossly corrupted
Gaussian measurements, with no prior knowledge of the intrinsic rank. We consider the …
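The measurement model behind this entry, written out for concreteness; the notation is a hedged reconstruction from the abstract (Gaussian sensing, sparse gross corruption, over-specified rank), not the paper's exact display.

```latex
% Sparsely and grossly corrupted Gaussian measurements of a rank-r* matrix M*:
y_i = \langle A_i, M^{\star} \rangle + s_i, \qquad i = 1, \dots, m,
\quad s \ \text{sparse, with arbitrarily large nonzero entries.}
% Rank-overspecified recovery via the nonsmooth factorized l1 objective:
\min_{X \in \mathbb{R}^{n \times r}} \; \frac{1}{m} \sum_{i=1}^{m}
\left| \langle A_i, X X^{\top} \rangle - y_i \right|, \qquad r \ge r^{\star}.
```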
Algorithmic regularization in tensor optimization: towards a lifted approach in matrix sensing
Gradient descent (GD) is crucial for generalization in machine learning models, as it induces
implicit regularization, promoting compact representations. In this work, we examine the role …