Generalization error rates in kernel regression: The crossover from the noiseless to noisy regime
In this manuscript we consider Kernel Ridge Regression (KRR) under the Gaussian design.
Exponents for the decay of the excess generalization error of KRR have been reported in …
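As a point of reference for the setting above, here is a minimal KRR sketch on Gaussian-design data; the RBF kernel, bandwidth, sample sizes, and ridge level are illustrative assumptions, not the paper's setup.

    import numpy as np

    rng = np.random.default_rng(0)

    def rbf(A, B, gamma=1.0):
        # Pairwise squared distances -> RBF Gram matrix.
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * sq)

    n, d, lam = 200, 5, 1e-3
    X = rng.standard_normal((n, d))                      # Gaussian design
    y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(n)   # noisy target

    K = rbf(X, X)
    alpha = np.linalg.solve(K + lam * np.eye(n), y)      # KRR dual coefficients

    Xte = rng.standard_normal((500, d))
    excess = np.mean((rbf(Xte, X) @ alpha - np.sin(Xte[:, 0])) ** 2)
    print(f"excess test error: {excess:.4f}")

Sweeping n and the noise level in such a sketch is one way to eyeball the error-decay exponents and the noiseless-to-noisy crossover the paper characterizes.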
Benign overfitting of constant-stepsize SGD for linear regression
There is an increasing realization that algorithmic inductive biases are central in preventing
overfitting; empirically, we often see a benign overfitting phenomenon in overparameterized …
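A hedged sketch of the phenomenon studied here: constant-stepsize SGD on an overparameterized linear model (d much larger than n) that fits noisy labels while keeping the excess test error moderate. All sizes and the stepsize are illustrative choices, not the paper's regime.

    import numpy as np

    rng = np.random.default_rng(1)
    n, d, lr, steps = 50, 500, 1e-3, 20000      # overparameterized: d >> n

    w_star = rng.standard_normal(d) / np.sqrt(d)
    X = rng.standard_normal((n, d))
    y = X @ w_star + 0.1 * rng.standard_normal(n)   # noisy labels

    w = np.zeros(d)
    for _ in range(steps):
        i = rng.integers(n)                          # one-sample stochastic gradient
        w -= lr * (X[i] @ w - y[i]) * X[i]           # constant stepsize throughout

    print(f"train mse {np.mean((X @ w - y) ** 2):.4f}")
    Xte = rng.standard_normal((1000, d))
    print(f"test mse  {np.mean((Xte @ (w - w_star)) ** 2):.4f}")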
Near-interpolators: Rapid norm growth and the trade-off between interpolation and generalization
We study the generalization capability of nearly-interpolating linear regressors: $\beta$'s
whose training error $\tau$ is positive but small, i.e., below the noise floor. Under a random …
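A minimal illustration of near-interpolation, assuming the ordinary ridge estimator as the near-interpolator: as the ridge parameter shrinks, the training error $\tau$ drops below the noise floor $\sigma^2$ while the norm of $\beta$ grows. The dimensions and noise level are invented for illustration.

    import numpy as np

    rng = np.random.default_rng(2)
    n, d, sigma = 100, 400, 0.5
    X = rng.standard_normal((n, d)) / np.sqrt(d)
    beta_star = rng.standard_normal(d)
    y = X @ beta_star + sigma * rng.standard_normal(n)

    for lam in (1.0, 1e-1, 1e-2, 1e-3, 1e-4):
        beta = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
        tau = np.mean((X @ beta - y) ** 2)           # training error
        print(f"lam={lam:.0e}  tau={tau:.3f} (noise floor {sigma**2:.2f})  "
              f"norm(beta)={np.linalg.norm(beta):.1f}")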
Last iterate risk bounds of SGD with decaying stepsize for overparameterized linear regression
Stochastic gradient descent (SGD) has been shown to generalize well in many deep
learning applications. In practice, one often runs SGD with a geometrically decaying …
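A rough sketch of SGD with a geometrically decaying stepsize (halved after each phase), reporting the last iterate; the phase length and initial stepsize are illustrative, not the schedule analyzed in the paper.

    import numpy as np

    rng = np.random.default_rng(3)
    n, d = 50, 200
    X = rng.standard_normal((n, d)) / np.sqrt(d)   # rows have norm ~ 1
    w_star = rng.standard_normal(d)
    y = X @ w_star + 0.1 * rng.standard_normal(n)

    w, lr = np.zeros(d), 0.5
    for phase in range(8):                 # geometric decay: halve lr each phase
        for _ in range(500):
            i = rng.integers(n)
            w -= lr * (X[i] @ w - y[i]) * X[i]
        lr *= 0.5

    print(f"last-iterate train mse: {np.mean((X @ w - y) ** 2):.4f}")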
Capacity dependent analysis for functional online learning algorithms
X Guo, ZC Guo, L Shi - Applied and Computational Harmonic Analysis, 2023 - Elsevier
This article provides convergence analysis of online stochastic gradient descent algorithms
for functional linear models. Adopting the characterizations of the slope function regularity …
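To make the functional setting concrete, a loose sketch of online SGD for a functional linear model, with curves discretized on a grid and the L2 inner product approximated by a Riemann sum; the Fourier dictionary, slope function, and stepsize schedule are all assumptions for illustration.

    import numpy as np

    rng = np.random.default_rng(4)
    m = 200
    t = np.linspace(0.0, 1.0, m)
    dt = t[1] - t[0]
    beta_star = np.sin(2 * np.pi * t)          # true slope function

    def sample():
        # Random predictor curve X(t) from a small Fourier dictionary.
        c = rng.standard_normal(3)
        x = c[0] + c[1] * np.cos(2 * np.pi * t) + c[2] * np.sin(2 * np.pi * t)
        y = (beta_star * x).sum() * dt + 0.1 * rng.standard_normal()
        return x, y

    beta = np.zeros(m)
    for k in range(1, 5001):
        x, y = sample()
        resid = (beta * x).sum() * dt - y       # <beta, x>_{L2} minus label
        beta -= (0.5 / np.sqrt(k)) * resid * x  # decaying-stepsize online update

    err = ((beta - beta_star) ** 2).sum() * dt
    print(f"L2 error of slope estimate: {err:.3f}")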
Last iterate convergence of SGD for Least-Squares in the Interpolation regime.
Motivated by the recent successes of neural networks that have the ability to fit the data
perfectly and generalize well, we study the noiseless model in the fundamental least …
Statistical optimality of divide and conquer kernel-based functional linear regression
J Liu, L Shi - arXiv preprint arXiv:2211.10968, 2022 - arxiv.org
Previous analysis of regularized functional linear regression in a reproducing kernel Hilbert
space (RKHS) typically requires the target function to be contained in this kernel space. This …
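The divide-and-conquer idea itself is easy to sketch; the toy below applies it to a scalar KRR problem rather than the paper's functional linear regression: fit an independent ridge estimator on each data split, then average the predictions.

    import numpy as np

    rng = np.random.default_rng(5)

    def rbf(A, B, gamma=2.0):
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * sq)

    n, parts, lam = 600, 4, 1e-2
    X = rng.uniform(-1, 1, (n, 1))
    y = np.cos(3 * X[:, 0]) + 0.1 * rng.standard_normal(n)
    Xte = np.linspace(-1, 1, 200)[:, None]

    preds = []
    for Xj, yj in zip(np.array_split(X, parts), np.array_split(y, parts)):
        K = rbf(Xj, Xj)
        alpha = np.linalg.solve(K + lam * np.eye(len(yj)), yj)
        preds.append(rbf(Xte, Xj) @ alpha)       # local KRR estimator

    f_hat = np.mean(preds, axis=0)               # divide-and-conquer average
    print("test mse:", np.mean((f_hat - np.cos(3 * Xte[:, 0])) ** 2))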
Continuized accelerations of deterministic and stochastic gradient descents, and of gossip algorithms
We introduce the "continuized" Nesterov acceleration, a close variant of Nesterov
acceleration whose variables are indexed by a continuous time parameter. The two …
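A caricature rather than the paper's construction: the sketch below runs ordinary discrete Nesterov steps and merely stamps them with the arrival times of a rate-1 Poisson process, whereas the continuized method also mixes its two variables continuously between arrivals. The quadratic objective and iteration count are arbitrary.

    import numpy as np

    rng = np.random.default_rng(6)
    A = np.diag(np.linspace(0.1, 1.0, 20))   # quadratic f(x) = 0.5 x'Ax, smoothness L = 1
    grad, L = (lambda v: A @ v), 1.0

    x = rng.standard_normal(20)
    x_prev = x.copy()
    t = 0.0
    for k in range(1, 201):
        t += rng.exponential(1.0)                     # rate-1 Poisson arrival times
        y = x + ((k - 1) / (k + 2)) * (x - x_prev)    # Nesterov momentum
        x_prev = x
        x = y - (1 / L) * grad(y)                     # gradient step at the arrival
    print(f"t = {t:.1f}, f(x) = {0.5 * x @ A @ x:.2e}")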
Provable generalization of overparameterized meta-learning trained with SGD
Despite the empirical success of deep meta-learning, theoretical understanding of
overparameterized meta-learning is still limited. This paper studies the generalization of a …
Kernel methods for causal functions: dose, heterogeneous and incremental response curves
We propose estimators based on kernel ridge regression for nonparametric causal functions
such as dose, heterogeneous and incremental response curves. The treatment and …
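A minimal sketch of the plug-in kernel idea for a dose curve: regress the outcome on (treatment, confounder) with KRR, then average the fitted surface over the empirical confounder distribution. The data-generating process, bandwidth, and ridge level are invented for illustration.

    import numpy as np

    rng = np.random.default_rng(7)

    def rbf(A, B, gamma=1.0):
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * sq)

    n = 300
    Xc = rng.standard_normal((n, 1))                 # confounder
    T = 0.5 * Xc[:, 0] + rng.standard_normal(n)      # treatment depends on confounder
    Y = np.sin(T) + Xc[:, 0] + 0.1 * rng.standard_normal(n)

    Z = np.column_stack([T, Xc[:, 0]])               # regress Y on (treatment, confounder)
    K = rbf(Z, Z)
    alpha = np.linalg.solve(K + 1e-2 * np.eye(n), Y)

    def dose_response(t):
        # Average the fitted surface over the empirical confounder distribution.
        Zt = np.column_stack([np.full(n, t), Xc[:, 0]])
        return np.mean(rbf(Zt, Z) @ alpha)

    for t in (-1.0, 0.0, 1.0):
        print(f"theta({t:+.1f}) = {dose_response(t):.3f}  (target {np.sin(t):.3f})")

Here the target dose curve is E_X[E[Y | T = t, X]] = sin(t), since the confounder has mean zero in this toy model.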