High-dimensional analysis of double descent for linear regression with random projections
F Bach - SIAM Journal on Mathematics of Data Science, 2024 - SIAM
We consider linear regression problems with a varying number of random projections,
where we provably exhibit a double descent curve for a fixed prediction problem, with a high …
Uniform consistency of cross-validation estimators for high-dimensional ridge regression
We examine generalized and leave-one-out cross-validation for ridge regression in a
proportional asymptotic framework where the dimension of the feature space grows …
Using artificial intelligence to rapidly identify microplastics pollution and predict microplastics environmental behaviors
With the massive release of microplastics (MPs) into the environment, research related to
MPs is advancing rapidly. Effective research methods are necessary to identify the chemical …
Centralized and Federated Models for the Analysis of Clinical Data
The progress of precision medicine research hinges on the gathering and analysis of
extensive and diverse clinical datasets. With the continued expansion of modalities, scales …
Distributed linear regression by averaging
E Dobriban, Y Sheng - 2021 - projecteuclid.org
The Annals of Statistics 2021, Vol. 49, No. 2,
918–943 https://doi.org/10.1214/20-AOS1984 © Institute of Mathematical Statistics, 2021 …
What causes the test error? Going beyond bias-variance via ANOVA
Modern machine learning methods are often overparametrized, allowing adaptation to the
data at a fine level. This can seem puzzling; in the worst case, such models do not need to …
Generalized equivalences between subsampling and ridge regularization
We establish precise structural and risk equivalences between subsampling and ridge
regularization for ensemble ridge estimators. Specifically, we prove that linear and quadratic …
Ridge regression: Structure, cross-validation, and sketching
We study the following three fundamental problems about ridge regression: (1) what is the
structure of the estimator? (2) how to correctly use cross-validation to choose the …
Bagging in overparameterized learning: Risk characterization and risk monotonization
Bagging is a commonly used ensemble technique in statistics and machine learning to
improve the performance of prediction procedures. In this paper, we study the prediction risk …
WONDER: Weighted one-shot distributed ridge regression in high dimensions
In many areas, practitioners need to analyze large data sets that challenge conventional
single-machine computing. To scale up data analysis, distributed and parallel computing …