High-dimensional limit theorems for SGD: Effective dynamics and critical scaling
We study the scaling limits of stochastic gradient descent (SGD) with constant step-size in
the high-dimensional regime. We prove limit theorems for the trajectories of summary …
Online stochastic gradient descent on non-convex losses from high-dimensional inference
Stochastic gradient descent (SGD) is a popular algorithm for optimization problems arising
in high-dimensional inference tasks. Here one produces an estimator of an unknown …
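Online (one-pass) SGD of the kind studied in this line of work can be illustrated on a toy single-index estimation task, where each iteration consumes a fresh Gaussian sample and the step size is scaled with the dimension. This is a minimal hypothetical sketch, not the setup of any listed paper: the tanh link function, dimension, and step-size constant are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 200            # ambient dimension (illustrative choice)
delta = 0.5 / d    # constant step size, scaled as O(1/d)
steps = 20_000

# Unknown unit-norm signal (the "teacher" direction)
theta_star = rng.standard_normal(d)
theta_star /= np.linalg.norm(theta_star)

# Random, essentially uninformative initialization
theta = rng.standard_normal(d)
theta /= np.linalg.norm(theta)

def overlap(a, b):
    """Cosine similarity: the natural summary statistic for this problem."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

for _ in range(steps):
    x = rng.standard_normal(d)        # fresh sample each step (online SGD)
    y = np.tanh(theta_star @ x)       # noiseless single-index label
    pred = np.tanh(theta @ x)
    # Gradient of the per-sample loss 0.5 * (pred - y)**2 w.r.t. theta
    grad = (pred - y) * (1.0 - pred**2) * x
    theta -= delta * grad

m = overlap(theta, theta_star)        # |m| near 1 indicates recovery
```

Tracking the scalar overlap `m` rather than the full iterate is exactly the "summary statistics" viewpoint: in the high-dimensional limit, such low-dimensional projections of the SGD trajectory are what satisfy effective (ballistic or diffusive) limiting dynamics.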
The benefits of reusing batches for gradient descent in two-layer networks: Breaking the curse of information and leap exponents
We investigate the training dynamics of two-layer neural networks when learning multi-index
target functions. We focus on multi-pass gradient descent (GD) that reuses the batches …
On the impact of overparameterization on the training of a shallow neural network in high dimensions
We study the training dynamics of a shallow neural network with quadratic activation
functions and quadratic cost in a teacher-student setup. In line with previous works on the …
Sudakov–Fernique post-AMP, and a new proof of the local convexity of the TAP free energy
M Celentano - The Annals of Probability, 2024 - projecteuclid.org
We develop an approach for studying the local convexity of a certain class of random
objectives around the iterates of an AMP algorithm. Our approach involves applying the …
Statistical limits of dictionary learning: random matrix theory and the spectral replica method
We consider increasingly complex models of matrix denoising and dictionary learning in the
Bayes-optimal setting, in the challenging regime where the matrices to infer have a rank …
Rethinking Mean-Field Glassy Dynamics and Its Relation with the Energy Landscape: The Surprising Case of the Spherical Mixed p-Spin Model
The spherical p-spin model is a fundamental model in statistical mechanics of a disordered
system with a random first-order transition. The dynamics of this model is interesting both for …
Quantitative propagation of chaos for SGD in wide neural networks
In this paper, we investigate the limiting behavior of a continuous-time counterpart of the
Stochastic Gradient Descent (SGD) algorithm applied to two-layer overparameterized neural …
Landscape complexity for the empirical risk of generalized linear models
We present a method to obtain the average and the typical value of the number of critical
points of the empirical risk landscape for generalized linear estimation problems and …
Random tensor theory for tensor decomposition
We propose a new framework for tensor decomposition based on trace invariants, which are
particular cases of tensor networks. In general, tensor networks are diagrams/graphs that …