Neural networks efficiently learn low-dimensional representations with SGD

A Mousavi-Hosseini, S Park, M Girotti… - arXiv preprint arXiv …, 2022 - arxiv.org
We study the problem of training a two-layer neural network (NN) of arbitrary width using
stochastic gradient descent (SGD) where the input $\boldsymbol{x} \in \mathbb{R}^d$ is …
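
A minimal sketch of the setting this snippet describes, assuming a ReLU two-layer network, squared loss, fresh Gaussian samples at each step, and a hypothetical single-index target (none of these specifics are taken from the paper):

```python
import numpy as np

# Online SGD on a two-layer network f(x) = a @ relu(W @ x).
# Data model, loss, and step size are illustrative assumptions.
rng = np.random.default_rng(0)
d, width, lr, steps = 20, 256, 0.01, 5000

W = rng.normal(scale=1.0 / np.sqrt(d), size=(width, d))
a = rng.normal(scale=1.0 / np.sqrt(width), size=width)
u = rng.normal(size=d); u /= np.linalg.norm(u)   # hypothetical low-dimensional (single-index) target direction

for _ in range(steps):
    x = rng.normal(size=d)                       # fresh sample: one SGD step per draw
    h = W @ x
    act = np.maximum(h, 0.0)
    err = a @ act - max(x @ u, 0.0)              # squared-loss residual against the target
    grad_a = err * act
    grad_W = err * np.outer(a * (h > 0), x)
    a -= lr * grad_a
    W -= lr * grad_W
```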

A selective review on statistical methods for massive data computation: distributed computing, subsampling, and minibatch techniques

X Li, Y Gao, H Chang, D Huang, Y Ma… - Statistical Theory and …, 2024 - Taylor & Francis
This paper presents a selective review of statistical computation methods for massive data
analysis. A large number of statistical methods for massive data computation have been …
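
A generic illustration of the minibatch/subsampling idea mentioned in this snippet (the least-squares model and batch size are assumptions for the example, not taken from the review):

```python
import numpy as np

# Minibatch SGD for least squares: each update touches a random subsample
# of size b instead of all n observations.
rng = np.random.default_rng(1)
n, d, b, lr = 100_000, 10, 64, 0.1

X = rng.normal(size=(n, d))
beta_true = rng.normal(size=d)
y = X @ beta_true + 0.1 * rng.normal(size=n)

beta = np.zeros(d)
for _ in range(2000):
    idx = rng.choice(n, size=b, replace=False)       # subsample a minibatch
    grad = X[idx].T @ (X[idx] @ beta - y[idx]) / b   # unbiased gradient estimate
    beta -= lr * grad
```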

Towards a complete analysis of Langevin Monte Carlo: Beyond Poincaré inequality

A Mousavi-Hosseini, TK Farghly, Y He… - The Thirty Sixth …, 2023 - proceedings.mlr.press
Langevin diffusions are rapidly convergent under appropriate functional inequality
assumptions. Hence, it is natural to expect that with additional smoothness conditions to …

On the convergence of Langevin Monte Carlo: The interplay between tail growth and smoothness

MA Erdogdu, R Hosseinzadeh - Conference on Learning …, 2021 - proceedings.mlr.press
We study sampling from a target distribution $\nu_* = e^{-f}$ using the unadjusted Langevin
Monte Carlo (LMC) algorithm. For any potential function $f$ whose tails behave like …
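
For reference, the unadjusted LMC iteration named here is the Euler discretization of the Langevin diffusion; a minimal sketch on a toy quadratic potential (the potential and step size are assumptions for illustration, not the paper's setting):

```python
import numpy as np

# Unadjusted Langevin Monte Carlo:
#   x_{k+1} = x_k - eta * grad_f(x_k) + sqrt(2 * eta) * N(0, I)
# Toy potential f(x) = ||x||^2 / 2, so the target nu_* = e^{-f} is a standard Gaussian.
rng = np.random.default_rng(2)
d, eta, n_steps = 5, 0.01, 10_000

x = np.zeros(d)
samples = np.empty((n_steps, d))
for k in range(n_steps):
    grad_f = x                                   # gradient of ||x||^2 / 2
    x = x - eta * grad_f + np.sqrt(2 * eta) * rng.normal(size=d)
    samples[k] = x
```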

Convergence rates of stochastic gradient descent under infinite noise variance

H Wang, M Gurbuzbalaban, L Zhu… - Advances in …, 2021 - proceedings.neurips.cc
Recent studies have provided both empirical and theoretical evidence illustrating that heavy
tails can emerge in stochastic gradient descent (SGD) in various scenarios. Such heavy tails …

Bias and extrapolation in Markovian linear stochastic approximation with constant stepsizes

D Huo, Y Chen, Q Xie - Abstract Proceedings of the 2023 ACM …, 2023 - dl.acm.org
We consider Linear Stochastic Approximation (LSA) with constant stepsize and Markovian
data. Viewing the joint process of the data and LSA iterate as a time-homogeneous Markov …
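
A minimal sketch of the LSA recursion this snippet refers to, driven by a hypothetical two-state Markov chain (the transition matrix and the state-dependent matrices are illustrative assumptions):

```python
import numpy as np

# Constant-stepsize LSA with Markovian data:
#   theta_{k+1} = theta_k + alpha * (b(x_k) - A(x_k) @ theta_k)
rng = np.random.default_rng(3)
alpha, n_steps, d = 0.05, 50_000, 3

P = np.array([[0.9, 0.1], [0.2, 0.8]])           # hypothetical transition matrix
A = [1.0 * np.eye(d), 2.0 * np.eye(d)]           # state-dependent matrices (assumed positive definite)
b = [np.ones(d), -np.ones(d)]

state, theta, avg = 0, np.zeros(d), np.zeros(d)
for k in range(n_steps):
    theta = theta + alpha * (b[state] - A[state] @ theta)
    avg += (theta - avg) / (k + 1)               # Polyak average of the iterates
    state = rng.choice(2, p=P[state])            # advance the Markov chain
```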

Stochastic multilevel composition optimization algorithms with level-independent convergence rates

K Balasubramanian, S Ghadimi, A Nguyen - SIAM Journal on Optimization, 2022 - SIAM
In this paper, we study smooth stochastic multilevel composition optimization problems,
where the objective function is a nested composition of T functions. We assume access to …
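
For concreteness, the nested objective this snippet refers to can be written as follows (generic notation; the stochastic oracle model is only sketched here):

```latex
F(\boldsymbol{x}) \;=\; f_T\bigl(f_{T-1}(\cdots f_1(\boldsymbol{x}) \cdots)\bigr),
\qquad
f_i(\cdot) \;=\; \mathbb{E}_{\xi_i}\!\bigl[g_i(\cdot, \xi_i)\bigr], \quad i = 1, \dots, T,
```

where, typically, only noisy evaluations of each inner function and its Jacobian are available at every query point.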

Convergence of Langevin Monte Carlo in chi-squared and Rényi divergence

MA Erdogdu, R Hosseinzadeh… - … Conference on Artificial …, 2022 - proceedings.mlr.press
We study sampling from a target distribution $\nu_* = e^{-f}$ using the unadjusted Langevin
Monte Carlo (LMC) algorithm when the potential $f$ satisfies a strong dissipativity condition …

Fractal structure and generalization properties of stochastic optimization algorithms

A Camuto, G Deligiannidis… - Advances in neural …, 2021 - proceedings.neurips.cc
Understanding generalization in deep learning has been one of the major challenges in
statistical learning theory over the last decade. While recent work has illustrated that the …

Computing the bias of constant-step stochastic approximation with Markovian noise

S Allmeier, N Gast - arXiv preprint arXiv:2405.14285, 2024 - arxiv.org
We study stochastic approximation algorithms with Markovian noise and constant step-size
$\alpha$. We develop a method based on infinitesimal generator comparisons to study the …
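
The constant-stepsize bias discussed in this snippet (and in the Huo, Chen, and Xie entry above) can be probed numerically; a minimal scalar sketch, using Richardson-Romberg extrapolation to cancel the leading bias term (an illustrative device here, not the paper's generator-comparison method):

```python
import numpy as np

# Constant-stepsize stochastic approximation with a state-dependent gain a(x),
# driven by a two-state Markov chain; the Markovian correlation induces an
# O(alpha) bias in the time-averaged iterate, which the two-stepsize
# combination below reduces.
rng = np.random.default_rng(4)

def run_sa(alpha, n_steps=200_000):
    P = np.array([[0.95, 0.05], [0.10, 0.90]])   # hypothetical transition matrix
    a = np.array([0.5, 2.0])                     # state-dependent "A(x)" (scalar case)
    b = np.array([1.0, -1.0])                    # state-dependent "b(x)"
    theta, state, avg = 0.0, 0, 0.0
    for k in range(n_steps):
        theta += alpha * (b[state] - a[state] * theta)
        avg += (theta - avg) / (k + 1)           # time average of the iterates
        state = rng.choice(2, p=P[state])
    return avg

est = run_sa(alpha=0.05)
est_double = run_sa(alpha=0.10)
extrapolated = 2 * est - est_double              # Richardson-Romberg: cancels the O(alpha) term
```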