Memorizing without overfitting: Bias, variance, and interpolation in overparameterized models
The bias-variance trade-off is a central concept in supervised learning. In classical statistics,
increasing the complexity of a model (e.g., number of parameters) reduces bias but also …
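For reference, the trade-off this entry alludes to is the standard decomposition of the expected squared prediction error (notation mine; σ² is the irreducible label-noise variance):

```latex
\mathbb{E}\!\left[(y - \hat f(x))^2\right]
  = \underbrace{\bigl(\mathbb{E}[\hat f(x)] - f(x)\bigr)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\!\left[\bigl(\hat f(x) - \mathbb{E}[\hat f(x)]\bigr)^2\right]}_{\text{variance}}
  + \sigma^2 .
```

Classically, richer models shrink the bias term while the variance term grows; the interpolation results discussed here ask what happens once the model fits the training data exactly.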
Learning through atypical phase transitions in overparameterized neural networks
Current deep neural networks are highly overparameterized (up to billions of connection
weights) and nonlinear. Yet they can fit data almost perfectly through variants of gradient …
Unveiling the structure of wide flat minima in neural networks
The success of deep learning has revealed the application potential of neural networks
across the sciences and opened up fundamental theoretical problems. In particular, the fact …
Classification of heavy-tailed features in high dimensions: a superstatistical approach
We characterise the learning of a mixture of two clouds of data points with generic centroids
via empirical risk minimisation in the high-dimensional regime, under the assumptions of …
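A minimal sketch of the setup as I read it, purely for illustration (the Student-t noise, logistic loss, and all parameter values below are my assumptions, not the paper's superstatistical model):

```python
# Two clouds of points with generic centroids and heavier-than-Gaussian
# fluctuations, classified by empirical risk minimisation (logistic loss).
import numpy as np

rng = np.random.default_rng(0)
n, d = 400, 50
mu_plus, mu_minus = rng.normal(size=d), rng.normal(size=d)   # generic centroids

y = rng.choice([-1.0, 1.0], size=n)
centroids = np.where(y[:, None] > 0, mu_plus, mu_minus)
X = centroids / np.sqrt(d) + rng.standard_t(df=3, size=(n, d))  # heavy-tailed noise

# Plain gradient descent on the ridge-regularised logistic empirical risk.
w, lam, lr = np.zeros(d), 0.01, 0.1
for _ in range(500):
    margins = y * (X @ w)
    grad = -(X * (y / (1 + np.exp(margins)))[:, None]).mean(axis=0) + lam * w
    w -= lr * grad

print("train accuracy:", np.mean(np.sign(X @ w) == y))
```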
Bias-variance decomposition of overparameterized regression with random linear features
In classical statistics, the bias-variance trade-off describes how varying a model's complexity
(e.g., number of fit parameters) affects its ability to make accurate predictions. According to …
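A small numerical sketch of the kind of experiment this abstract points at, under my own assumptions (minimum-norm least squares on Gaussian random linear features; not necessarily the paper's exact model or decomposition):

```python
# Test error of minimum-norm least squares on random linear features,
# traced as the number of features p crosses the interpolation threshold.
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, d = 50, 500, 100

beta = rng.normal(size=d) / np.sqrt(d)          # linear teacher
X_tr = rng.normal(size=(n_train, d))
X_te = rng.normal(size=(n_test, d))
y_tr = X_tr @ beta + 0.1 * rng.normal(size=n_train)
y_te = X_te @ beta

for p in [10, 25, 50, 75, 100, 200, 400]:
    W = rng.normal(size=(d, p)) / np.sqrt(d)    # random linear feature map
    Z_tr, Z_te = X_tr @ W, X_te @ W
    a = np.linalg.pinv(Z_tr) @ y_tr             # minimum-norm least-squares fit
    print(f"p = {p:4d}   test MSE = {np.mean((Z_te @ a - y_te) ** 2):.3f}")
```

In runs like this the test error tends to spike near p ≈ n_train (the interpolation threshold) and then fall again as p grows, which is the overparameterized regime such decompositions analyse.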
Typical and atypical solutions in nonconvex neural networks with discrete and continuous weights
We study the binary and continuous negative-margin perceptrons as simple nonconvex
neural network models learning random rules and associations. We analyze the geometry of …
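For orientation, the usual definition behind these models (standard in this literature, not quoted from the abstract): weights w ∈ ℝ^N, either binary (w_i = ±1) or spherical (‖w‖² = N), must satisfy P random constraints of the form

```latex
\frac{1}{\sqrt{N}}\,\mathbf{w}\cdot\boldsymbol{\xi}^{\mu} \;\ge\; \kappa ,
\qquad \mu = 1,\dots,P, \qquad \kappa < 0 ,
```

with i.i.d. random patterns ξ^μ; the negative margin κ < 0 is what makes even the continuous (spherical) version a nonconvex problem.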
Evolutionary Retrofitting
AfterLearnER (After Learning Evolutionary Retrofitting) consists of applying
non-differentiable optimization, including evolutionary methods, to refine fully trained machine …
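A minimal sketch of the general idea as I understand it (a greedy (1+1) evolution strategy on frozen, already-trained weights), not the actual AfterLearnER implementation; the objective and every name below are hypothetical:

```python
# Refine trained weights against a black-box, possibly non-differentiable score.
import numpy as np

def retrofit(weights, score, steps=200, sigma=0.05, seed=0):
    """Greedy (1+1)-ES: keep a perturbation only if the score does not drop."""
    rng = np.random.default_rng(seed)
    best_w, best_s = weights.copy(), score(weights)
    for _ in range(steps):
        cand = best_w + sigma * rng.normal(size=best_w.shape)
        s = score(cand)
        if s >= best_s:
            best_w, best_s = cand, s
    return best_w, best_s

# Toy usage: 0/1 accuracy of a linear model on a validation set stands in for
# the non-differentiable metric one would retrofit against.
rng = np.random.default_rng(1)
X_val = rng.normal(size=(200, 10))
y_val = np.sign(X_val @ rng.normal(size=10))
w0 = rng.normal(size=10)                      # stands in for trained weights
acc = lambda w: np.mean(np.sign(X_val @ w) == y_val)
w_star, best = retrofit(w0, acc)
print(f"validation accuracy: {acc(w0):.2f} -> {best:.2f}")
```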
High-dimensional manifold of solutions in neural networks: insights from statistical physics
EM Malatesta - arXiv preprint arXiv:2309.09240, 2023 - arxiv.org
In these pedagogic notes I review the statistical mechanics approach to neural networks,
focusing on the paradigmatic example of the perceptron architecture with binary and …
Solvable model for the linear separability of structured data
M Gherardi - Entropy, 2021 - mdpi.com
Linear separability, a core concept in supervised machine learning, refers to whether the
labels of a data set can be captured by the simplest possible machine: a linear classifier. In …
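As a concrete illustration of that definition (my own sketch, not the paper's solvable model): linear separability can be checked as a linear-programming feasibility problem, as below.

```python
# A data set (X, y) with labels in {-1, +1} is (strictly) linearly separable
# iff the constraints y_i * (w . x_i + b) >= 1 admit a solution.
import numpy as np
from scipy.optimize import linprog

def is_linearly_separable(X, y):
    n, d = X.shape
    # Variables [w, b]; rewrite constraints as -y_i * (x_i . w + b) <= -1.
    A_ub = -y[:, None] * np.hstack([X, np.ones((n, 1))])
    res = linprog(c=np.zeros(d + 1), A_ub=A_ub, b_ub=-np.ones(n),
                  bounds=[(None, None)] * (d + 1), method="highs")
    return res.success

rng = np.random.default_rng(1)
X = rng.normal(size=(40, 2))
y = np.sign(X[:, 0] + 0.5 * X[:, 1])          # labels produced by a linear rule
print(is_linearly_separable(X, y))            # expected: True
print(is_linearly_separable(X, rng.choice([-1.0, 1.0], size=40)))  # likely False
```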
Star-shaped space of solutions of the spherical negative perceptron
Empirical studies on the landscape of neural networks have shown that low-energy
configurations are often found in complex connected structures, where zero-energy paths …