A farewell to the bias-variance tradeoff? An overview of the theory of overparameterized machine learning
The rapid recent progress in machine learning (ML) has raised a number of scientific
questions that challenge the longstanding dogma of the field. One of the most important …
The generalization error of random features regression: Precise asymptotics and the double descent curve
Deep learning methods operate in regimes that defy the traditional statistical mindset.
Neural network architectures often contain more parameters than training samples, and are …
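A minimal sketch of the kind of random features regression studied here (illustrative parameters only, not the paper's asymptotic setup): ReLU random features fit by minimum-norm least squares, with test error traced as the number of features N crosses the number of training samples n.

    # Random features regression sketch: test error vs. number of random features N
    import numpy as np

    rng = np.random.default_rng(0)
    n, d, n_test = 200, 30, 2000
    beta = rng.standard_normal(d) / np.sqrt(d)        # illustrative target signal

    X_train = rng.standard_normal((n, d))
    X_test = rng.standard_normal((n_test, d))
    y_train = X_train @ beta + 0.1 * rng.standard_normal(n)
    y_test = X_test @ beta

    def random_feature_test_error(N):
        W = rng.standard_normal((d, N)) / np.sqrt(d)  # random (frozen) first-layer weights
        Phi_train = np.maximum(X_train @ W, 0.0)      # ReLU random features
        Phi_test = np.maximum(X_test @ W, 0.0)
        a = np.linalg.pinv(Phi_train) @ y_train       # minimum-norm least-squares fit
        return np.mean((Phi_test @ a - y_test) ** 2)

    for N in [50, 100, 190, 200, 210, 400, 1000, 4000]:
        print(f"N = {N:5d}  test MSE = {random_feature_test_error(N):.3f}")

The test error typically peaks near N = n and decreases again as N grows, tracing a double descent curve.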
Surprises in high-dimensional ridgeless least squares interpolation
Interpolators—estimators that achieve zero training error—have attracted growing attention
in machine learning, mainly because state-of-the-art neural networks appear to be models of …
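A minimal sketch of ridgeless (minimum-l2-norm) least squares interpolation in the overparameterized regime p > n, assuming a simple planted-signal model for illustration: the pseudoinverse solution fits the training data exactly yet can still predict reasonably well.

    # Minimum-norm interpolating least squares in the p > n regime
    import numpy as np

    rng = np.random.default_rng(1)
    n, p = 100, 500
    beta = np.zeros(p)
    beta[:10] = 1.0                                   # illustrative sparse planted signal
    X = rng.standard_normal((n, p))
    y = X @ beta + 0.1 * rng.standard_normal(n)

    beta_hat = np.linalg.pinv(X) @ y                  # least-norm solution of X b = y
    print("train MSE:", np.mean((X @ beta_hat - y) ** 2))   # ~ 0: an interpolator

    X_new = rng.standard_normal((1000, p))
    print("test  MSE:", np.mean((X_new @ beta_hat - X_new @ beta) ** 2))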
Random features for kernel approximation: A survey on algorithms, theory, and beyond
The class of random features is one of the most popular techniques to speed up kernel
methods in large-scale problems. Related works have been recognized by the NeurIPS Test …
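A minimal sketch of one standard random features construction, random Fourier features for the Gaussian/RBF kernel; the bandwidth sigma and feature count D below are illustrative choices, not values from the survey.

    # Random Fourier features approximating k(x, z) = exp(-||x - z||^2 / (2 sigma^2))
    import numpy as np

    rng = np.random.default_rng(2)
    d, D, sigma = 5, 2000, 1.0                        # input dim, number of features, bandwidth

    W = rng.standard_normal((d, D)) / sigma           # frequencies drawn from N(0, sigma^-2 I)
    b = rng.uniform(0.0, 2 * np.pi, D)                # random phases

    def features(X):
        return np.sqrt(2.0 / D) * np.cos(X @ W + b)

    x = rng.standard_normal(d)
    z = rng.standard_normal(d)
    exact = np.exp(-np.sum((x - z) ** 2) / (2 * sigma ** 2))
    Phi = features(np.vstack([x, z]))                 # rows: phi(x), phi(z)
    approx = Phi[0] @ Phi[1]
    print(f"exact kernel {exact:.4f}  vs  RFF approximation {approx:.4f}")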
A model of double descent for high-dimensional binary linear classification
We consider a model for logistic regression in which only a subset of the features is used to train a linear classifier over the training samples. The classifier is obtained by running …
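A minimal sketch of a feature-subset setup in that spirit (the data model, step size, and iteration count are assumptions for illustration, not the paper's model): a linear classifier is trained by gradient descent on the logistic loss using only the first k of d features, and test error is tracked as k varies.

    # Logistic-loss linear classification on a growing feature subset
    import numpy as np

    rng = np.random.default_rng(3)
    n, d, n_test = 100, 400, 5000
    w_star = rng.standard_normal(d) / np.sqrt(d)      # illustrative ground-truth direction

    def sample(m):
        X = rng.standard_normal((m, d))
        y = np.where(X @ w_star + 0.1 * rng.standard_normal(m) > 0, 1.0, -1.0)
        return X, y

    X_train, y_train = sample(n)
    X_test, y_test = sample(n_test)

    def test_error_with_k_features(k, steps=1000, lr=0.1):
        Xk = X_train[:, :k]
        w = np.zeros(k)
        for _ in range(steps):                        # gradient descent on the logistic loss
            margins = y_train * (Xk @ w)
            grad = -(Xk * (y_train / (1 + np.exp(margins)))[:, None]).mean(axis=0)
            w -= lr * grad
        preds = np.sign(X_test[:, :k] @ w)
        return np.mean(preds != y_test)

    for k in [10, 50, 100, 150, 200, 400]:
        print(f"k = {k:3d}  test 0-1 error = {test_error_with_k_features(k):.3f}")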
On the Optimal Weighted Regularization in Overparameterized Linear Regression
We consider the linear model $y = X\beta_{\star} + \epsilon$ with $X \in \mathbb{R}^{n \times p}$ in the overparameterized regime $p > n$. We estimate $\beta_{\star}$ …
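A minimal sketch of weighted (generalized Tikhonov) regularization in linear regression: the estimator minimizes $\|y - Xb\|^2 + b^\top W b$ and has the closed form $(X^\top X + W)^{-1} X^\top y$. The weight matrices below are purely illustrative; the paper's question is which W is optimal.

    # Weighted ridge regression in the p > n regime with two illustrative weight matrices
    import numpy as np

    rng = np.random.default_rng(4)
    n, p = 80, 200
    beta_star = rng.standard_normal(p) / np.sqrt(p)
    X = rng.standard_normal((n, p))
    y = X @ beta_star + 0.1 * rng.standard_normal(n)

    def weighted_ridge(W):
        # closed-form minimizer of ||y - X b||^2 + b' W b
        return np.linalg.solve(X.T @ X + W, X.T @ y)

    for name, W in [("isotropic, lambda = 1", np.eye(p)),
                    ("anisotropic (illustrative)", np.diag(np.linspace(0.1, 10.0, p)))]:
        beta_hat = weighted_ridge(W)
        print(f"{name:28s}  estimation error = {np.sum((beta_hat - beta_star) ** 2):.3f}")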
Understanding double descent requires a fine-grained bias-variance decomposition
B Adlam, J Pennington - Advances in neural information …, 2020 - proceedings.neurips.cc
Classical learning theory suggests that the optimal generalization performance of a machine
learning model should occur at an intermediate model complexity, with simpler models …
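A minimal sketch of an empirical bias-variance decomposition for ridge regression, splitting expected test error into bias^2 and variance by refitting on independently drawn training sets. The paper's decomposition is finer-grained (it separates several sources of randomness); this sketch only averages over the training data.

    # Monte Carlo bias-variance decomposition for ridge regression
    import numpy as np

    rng = np.random.default_rng(5)
    n, p, lam, reps = 50, 100, 1e-2, 300
    beta_star = rng.standard_normal(p) / np.sqrt(p)
    X_test = rng.standard_normal((500, p))
    f_star = X_test @ beta_star                       # noiseless targets at the test points

    preds = np.empty((reps, X_test.shape[0]))
    for r in range(reps):
        X = rng.standard_normal((n, p))               # fresh training set each repetition
        y = X @ beta_star + 0.5 * rng.standard_normal(n)
        beta_hat = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
        preds[r] = X_test @ beta_hat

    mean_pred = preds.mean(axis=0)
    bias2 = np.mean((mean_pred - f_star) ** 2)
    variance = np.mean(preds.var(axis=0))
    print(f"bias^2 = {bias2:.3f}   variance = {variance:.3f}   sum = {bias2 + variance:.3f}")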
Optimal regularization can mitigate double descent
Recent empirical and theoretical studies have shown that many learning algorithms--from
linear regression to neural networks--can have test performance that is non-monotonic in …
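A minimal sketch of that phenomenon in a simple misspecified linear model (all constants are illustrative): near-ridgeless regression can show a test-error spike around p ≈ n, while a ridge penalty tuned over a grid (an oracle choice here, for brevity) behaves far better.

    # Ridgeless vs. tuned-ridge test error as the number of used features p varies
    import numpy as np

    rng = np.random.default_rng(6)
    n, n_test, d_max = 100, 2000, 400
    beta_full = rng.standard_normal(d_max) / np.sqrt(d_max)
    X_full = rng.standard_normal((n, d_max))
    y = X_full @ beta_full + 0.2 * rng.standard_normal(n)
    X_test_full = rng.standard_normal((n_test, d_max))
    y_test = X_test_full @ beta_full

    def ridge_test_mse(p, lam):
        X, X_test = X_full[:, :p], X_test_full[:, :p]
        beta_hat = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
        return np.mean((X_test @ beta_hat - y_test) ** 2)

    for p in [50, 90, 100, 110, 200, 400]:
        near_zero = ridge_test_mse(p, 1e-8)                              # ~ ridgeless
        tuned = min(ridge_test_mse(p, lam) for lam in np.logspace(-4, 2, 13))  # oracle grid choice
        print(f"p = {p:3d}   ridgeless MSE = {near_zero:10.2f}   tuned-ridge MSE = {tuned:6.3f}")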
Finite-sample analysis of interpolating linear classifiers in the overparameterized regime
NS Chatterji, PM Long - Journal of Machine Learning Research, 2021 - jmlr.org
We prove bounds on the population risk of the maximum margin algorithm for two-class
linear classification. For linearly separable training data, the maximum margin algorithm has …
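A minimal sketch of the maximum margin classifier on linearly separable, overparameterized data. The hard-margin solution is approximated below with scikit-learn's linear-kernel SVC and a very large C; the data model is an assumption for illustration, and the paper's risk bounds are not reproduced.

    # Approximate hard-margin (maximum margin) linear classification with p > n
    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(7)
    n, p, n_test = 50, 300, 2000
    w_star = rng.standard_normal(p) / np.sqrt(p)      # illustrative ground-truth direction

    def sample(m):
        X = rng.standard_normal((m, p))
        y = np.sign(X @ w_star)
        return X, y

    X_train, y_train = sample(n)                      # separable with high probability when p >> n
    X_test, y_test = sample(n_test)

    clf = SVC(kernel="linear", C=1e10).fit(X_train, y_train)   # large C ~ hard margin
    print("train error:", np.mean(clf.predict(X_train) != y_train))  # 0: the data are interpolated
    print("test  error:", np.mean(clf.predict(X_test) != y_test))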
Overparameterization improves robustness to covariate shift in high dimensions
N Tripuraneni, B Adlam… - Advances in Neural …, 2021 - proceedings.neurips.cc
A significant obstacle in the development of robust machine learning models is covariate shift, a form of distribution shift that occurs when the input distributions of the …