Uniform Generalization Bounds on Data-Dependent Hypothesis Sets via PAC-Bayesian Theory on Random Sets

B Dupuis, P Viallard, G Deligiannidis… - Journal of Machine …, 2024 - jmlr.org
We propose data-dependent uniform generalization bounds by approaching the problem
from a PAC-Bayesian perspective. We first apply the PAC-Bayesian framework on “random …

Emergence of heavy tails in homogenized stochastic gradient descent

Z Jiao, M Keller-Ressel - arXiv preprint arXiv:2402.01382, 2024 - arxiv.org
It has repeatedly been observed that loss minimization by stochastic gradient descent (SGD)
leads to heavy-tailed distributions of neural network parameters. Here, we analyze a …

Understanding the Generalization Error of Markov algorithms through Poissonization

B Dupuis, M Haddouche, G Deligiannidis… - arXiv preprint arXiv …, 2025 - arxiv.org
Using continuous-time stochastic differential equation (SDE) proxies to stochastic
optimization algorithms has proven fruitful for understanding their generalization abilities. A …

Algorithmic Stability of Stochastic Gradient Descent with Momentum under Heavy-Tailed Noise

T Dang, M Barsbey, AKM Sonet… - arXiv preprint arXiv …, 2025 - arxiv.org
Understanding the generalization properties of optimization algorithms under heavy-tailed
noise has gained growing attention. However, the existing theoretical results mainly focus …