Generalization error rates in kernel regression: The crossover from the noiseless to noisy regime

H Cui, B Loureiro, F Krzakala… - Advances in Neural …, 2021 - proceedings.neurips.cc
In this manuscript we consider Kernel Ridge Regression (KRR) under the Gaussian design.
Exponents for the decay of the excess generalization error of KRR have been reported in …
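
For readers new to the setting, here is a minimal NumPy sketch of kernel ridge regression under a Gaussian design; the RBF kernel, sample sizes, and ridge penalty are illustrative assumptions, not the configuration analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf(A, B, h=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * h ** 2))

n, d, lam = 200, 5, 1e-2                   # samples, dimension, ridge penalty
X = rng.standard_normal((n, d))            # Gaussian design
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(n)

# KRR dual weights: (K + lam * n * I) alpha = y
alpha = np.linalg.solve(rbf(X, X) + lam * n * np.eye(n), y)

X_test = rng.standard_normal((1000, d))
pred = rbf(X_test, X) @ alpha
print("excess test error:", np.mean((pred - np.sin(X_test[:, 0])) ** 2))
```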

Benign overfitting of constant-stepsize SGD for linear regression

D Zou, J Wu, V Braverman, Q Gu… - … on Learning Theory, 2021 - proceedings.mlr.press
There is an increasing realization that algorithmic inductive biases are central in preventing
overfitting; empirically, we often see a benign overfitting phenomenon in overparameterized …
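
The algorithm named in the title is simple to state. Below is a toy sketch of constant-stepsize SGD, with tail averaging of the iterates, on a streaming overparameterized linear regression problem; the dimensions, stepsize, isotropic covariance, and averaging window are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 500
w_star = rng.standard_normal(d)

eta, T, burn_in = 0.5, 20000, 10000        # constant stepsize throughout
w, avg = np.zeros(d), np.zeros(d)
for step in range(T):
    x = rng.standard_normal(d) / np.sqrt(d)    # fresh covariate, Sigma = I/d
    y = x @ w_star + 0.1 * rng.standard_normal()
    w -= eta * (x @ w - y) * x                 # one-sample stochastic gradient
    if step >= burn_in:                        # average the tail of the iterates
        avg += w / (T - burn_in)

# Population excess risk under Sigma = I/d is ||w - w*||^2 / d.
print("last iterate :", np.sum((w - w_star) ** 2) / d)
print("tail average :", np.sum((avg - w_star) ** 2) / d)
```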

Near-interpolators: Rapid norm growth and the trade-off between interpolation and generalization

Y Wang, R Sonthalia, W Hu - International Conference on …, 2024 - proceedings.mlr.press
We study the generalization capability of nearly-interpolating linear regressors: $\beta$'s
whose training error $\tau$ is positive but small, i.e., below the noise floor. Under a random …
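
A quick way to see the object of study: ridge regressors whose training error $\tau$ sits below the noise level, and how their norm grows as the ridge penalty shrinks. The random design and noise level below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, sigma = 100, 300, 0.5
X = rng.standard_normal((n, d)) / np.sqrt(d)
beta_star = rng.standard_normal(d)
y = X @ beta_star + sigma * rng.standard_normal(n)

# Shrinking the penalty pushes the training error below the noise floor
# (sigma^2 = 0.25) while the norm of the estimator grows rapidly.
for lam in [1.0, 1e-1, 1e-2, 1e-3, 1e-4]:
    beta = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
    tau = np.mean((X @ beta - y) ** 2)         # training error
    print(f"lam={lam:.0e}  train err={tau:.3f}  ||beta||={np.linalg.norm(beta):.1f}")
```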

Last iterate risk bounds of SGD with decaying stepsize for overparameterized linear regression

J Wu, D Zou, V Braverman, Q Gu… - … on Machine Learning, 2022 - proceedings.mlr.press
Stochastic gradient descent (SGD) has been shown to generalize well in many deep
learning applications. In practice, one often runs SGD with a geometrically decaying …
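
A minimal sketch of the schedule in question: streaming SGD whose stepsize is halved at the end of each stage, with the last iterate returned. The stage count, decay factor, and isotropic covariance are illustrative assumptions, not the paper's calibrated schedule.

```python
import numpy as np

rng = np.random.default_rng(3)
d = 400
w_star = rng.standard_normal(d)

w, eta = np.zeros(d), 0.5                  # initial stepsize
T, stages = 20000, 10
for step in range(T):
    if step > 0 and step % (T // stages) == 0:
        eta *= 0.5                         # geometrically decaying stepsize
    x = rng.standard_normal(d) / np.sqrt(d)    # streaming covariate, Sigma = I/d
    y = x @ w_star + 0.1 * rng.standard_normal()
    w -= eta * (x @ w - y) * x

print("last-iterate excess risk:", np.sum((w - w_star) ** 2) / d)
```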

Capacity dependent analysis for functional online learning algorithms

X Guo, ZC Guo, L Shi - Applied and Computational Harmonic Analysis, 2023 - Elsevier
This article provides a convergence analysis of online stochastic gradient descent algorithms
for functional linear models. Adopting the characterizations of the slope function regularity …
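
A discretized sketch of the model: y = ∫ X(t)β(t) dt + noise, fitted by online SGD with a polynomially decaying stepsize. The grid, the random predictor curves, and the stepsize exponent are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
m = 100                                    # grid size on [0, 1]
t = np.linspace(0, 1, m)
dt = t[1] - t[0]
beta_star = np.sin(2 * np.pi * t)          # true slope function

def inner(f, g):                           # Riemann approximation of the integral
    return float(np.sum(f * g) * dt)

def sample():
    """A random predictor curve and its noisy scalar response."""
    X = sum(rng.standard_normal() / (k + 1) * np.cos(k * np.pi * t)
            for k in range(10))
    return X, inner(X, beta_star) + 0.1 * rng.standard_normal()

beta = np.zeros(m)
for i in range(1, 5001):                   # online: one fresh sample per step
    X, y = sample()
    beta -= (0.5 / np.sqrt(i)) * (inner(X, beta) - y) * X

print("L2 error of slope estimate:", np.sqrt(inner(beta - beta_star, beta - beta_star)))
```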

Last iterate convergence of SGD for least squares in the interpolation regime

AV Varre, L Pillaud-Vivien… - Advances in Neural …, 2021 - proceedings.neurips.cc
Motivated by the recent successes of neural networks that have the ability to fit the data
perfectly and generalize well, we study the noiseless model in the fundamental least …
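
The phenomenon is easy to reproduce in a toy example: with more parameters than samples and noiseless labels, SGD drives the training loss to zero, while the parameter estimate need not recover the truth. Sizes and stepsize below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
n, d = 50, 200                             # interpolation regime: d > n
X = rng.standard_normal((n, d)) / np.sqrt(d)
w_star = rng.standard_normal(d)
y = X @ w_star                             # noiseless labels

w, eta = np.zeros(d), 0.5
for _ in range(20000):
    i = rng.integers(n)                    # one-sample stochastic gradient
    w -= eta * (X[i] @ w - y[i]) * X[i]

print("training MSE :", np.mean((X @ w - y) ** 2))   # driven to ~0
print("param error  :", np.linalg.norm(w - w_star))  # stays away from 0
```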

Statistical optimality of divide and conquer kernel-based functional linear regression

J Liu, L Shi - arXiv preprint arXiv:2211.10968, 2022 - arxiv.org
Previous analysis of regularized functional linear regression in a reproducing kernel Hilbert
space (RKHS) typically requires the target function to be contained in this kernel space. This …
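
For context, a sketch of the divide-and-conquer idea: fit kernel ridge regression independently on each block of the sample, then average the block predictors. This toy version uses scalar covariates rather than the functional covariates and RKHS conditions of the paper; the kernel, block count, and penalty are illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)

def rbf(A, B, h=1.0):
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * h ** 2))

n, d, lam, blocks = 600, 3, 1e-2, 6
X = rng.standard_normal((n, d))
y = np.cos(X[:, 0]) + 0.1 * rng.standard_normal(n)
X_test = rng.standard_normal((500, d))

pred = np.zeros(len(X_test))
for Xb, yb in zip(np.array_split(X, blocks), np.array_split(y, blocks)):
    K = rbf(Xb, Xb)                            # fit KRR on one block only
    alpha = np.linalg.solve(K + lam * len(yb) * np.eye(len(yb)), yb)
    pred += rbf(X_test, Xb) @ alpha / blocks   # average the block predictors

print("test error:", np.mean((pred - np.cos(X_test[:, 0])) ** 2))
```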

Continuized accelerations of deterministic and stochastic gradient descents, and of gossip algorithms

M Even, R Berthier, F Bach… - Advances in …, 2021 - proceedings.neurips.cc
We introduce the “continuized” Nesterov acceleration, a close variant of Nesterov
acceleration whose variables are indexed by a continuous time parameter. The two …
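
A toy event-driven simulation in the spirit of the continuized approach: two coupled variables mix continuously between the arrival times of a rate-1 Poisson clock, and a gradient step is taken at each arrival. The mixing rate and stepsizes below are heuristic choices for a quadratic objective, not the calibrated parameters derived in the paper.

```python
import numpy as np

rng = np.random.default_rng(7)
d = 20
A = np.diag(np.linspace(0.1, 10.0, d))     # quadratic objective f(x) = x^T A x / 2
mu, L = 0.1, 10.0                          # strong convexity and smoothness
rho = np.sqrt(mu / L)                      # continuous mixing rate (heuristic)

x = rng.standard_normal(d)
z = x.copy()
for _ in range(2000):
    tau = rng.exponential(1.0)             # waiting time of the rate-1 clock
    # Exact solution of dx = rho (z - x) dt, dz = rho (x - z) dt over time tau:
    mid, gap = (x + z) / 2, (x - z) / 2
    gap *= np.exp(-2 * rho * tau)
    x, z = mid + gap, mid - gap
    g = A @ x                              # gradient step at the arrival time
    x = x - g / L
    z = z - g / np.sqrt(mu * L)

print("objective value:", 0.5 * x @ A @ x)
```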

Provable generalization of overparameterized meta-learning trained with SGD

Y Huang, Y Liang, L Huang - Advances in Neural …, 2022 - proceedings.neurips.cc
Despite the empirical success of deep meta-learning, theoretical understanding of
overparameterized meta-learning is still limited. This paper studies the generalization of a …

Kernel methods for causal functions: dose, heterogeneous and incremental response curves

R Singh, L Xu, A Gretton - Biometrika, 2024 - academic.oup.com
We propose estimators based on kernel ridge regression for nonparametric causal functions
such as dose, heterogeneous and incremental response curves. The treatment and …
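
A simplified sketch of a kernel-ridge-regression estimate of a dose-response curve; here the dose is randomized, so plain regression of the outcome on the dose identifies the curve, whereas the paper's estimators additionally handle covariates and confounding. All simulation choices are illustrative.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 400
D = rng.uniform(0, 1, n)                   # randomized continuous dose
Y = np.sin(3 * D) + 0.2 * rng.standard_normal(n)

def k(a, b, h=0.2):                        # RBF kernel on doses
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * h ** 2))

lam = 1e-2
alpha = np.linalg.solve(k(D, D) + lam * n * np.eye(n), Y)

grid = np.linspace(0, 1, 5)
curve = k(grid, D) @ alpha                 # estimated dose-response curve
print("estimate:", np.round(curve, 2))
print("truth   :", np.round(np.sin(3 * grid), 2))
```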