High-dimensional analysis of double descent for linear regression with random projections

F Bach - SIAM Journal on Mathematics of Data Science, 2024 - SIAM
We consider linear regression problems with a varying number of random projections,
where we provably exhibit a double descent curve for a fixed prediction problem, with a high …

Uniform consistency of cross-validation estimators for high-dimensional ridge regression

P Patil, Y Wei, A Rinaldo… - … Conference on Artificial …, 2021 - proceedings.mlr.press
We examine generalized and leave-one-out cross-validation for ridge regression in a
proportional asymptotic framework where the dimension of the feature space grows …

Using artificial intelligence to rapidly identify microplastics pollution and predict microplastics environmental behaviors

B Hu, Y Dai, H Zhou, Y Sun, H Yu, Y Dai… - Journal of Hazardous …, 2024 - Elsevier
With the massive release of microplastics (MPs) into the environment, research related to
MPs is advancing rapidly. Effective research methods are necessary to identify the chemical …

Centralized and Federated Models for the Analysis of Clinical Data

R Li, JD Romano, Y Chen… - Annual Review of …, 2024 - annualreviews.org
The progress of precision medicine research hinges on the gathering and analysis of
extensive and diverse clinical datasets. With the continued expansion of modalities, scales …

Distributed linear regression by averaging

E Dobriban, Y Sheng - 2021 - projecteuclid.org
The Annals of Statistics 2021, Vol. 49, No. 2,
918–943 https://doi.org/10.1214/20-AOS1984 © Institute of Mathematical Statistics, 2021 …

What causes the test error? Going beyond bias-variance via ANOVA

L Lin, E Dobriban - Journal of Machine Learning Research, 2021 - jmlr.org
Modern machine learning methods are often overparametrized, allowing adaptation to the
data at a fine level. This can seem puzzling; in the worst case, such models do not need to …

Generalized equivalences between subsampling and ridge regularization

P Patil, JH Du - Advances in Neural Information Processing …, 2023 - proceedings.neurips.cc
We establish precise structural and risk equivalences between subsampling and ridge
regularization for ensemble ridge estimators. Specifically, we prove that linear and quadratic …

Ridge regression: Structure, cross-validation, and sketching

S Liu, E Dobriban - arXiv preprint arXiv:1910.02373, 2019 - arxiv.org
We study the following three fundamental problems about ridge regression: (1) what is the
structure of the estimator? (2) how to correctly use cross-validation to choose the …

Bagging in overparameterized learning: Risk characterization and risk monotonization

P Patil, JH Du, AK Kuchibhotla - Journal of Machine Learning Research, 2023 - jmlr.org
Bagging is a commonly used ensemble technique in statistics and machine learning to
improve the performance of prediction procedures. In this paper, we study the prediction risk …

WONDER: Weighted one-shot distributed ridge regression in high dimensions

E Dobriban, Y Sheng - Journal of Machine Learning Research, 2020 - jmlr.org
In many areas, practitioners need to analyze large data sets that challenge conventional
single-machine computing. To scale up data analysis, distributed and parallel computing …