On the convergence of FedAvg on non-IID data

X Li, K Huang, W Yang, S Wang, Z Zhang - arXiv preprint arXiv …, 2019 - arxiv.org
Federated learning enables a large amount of edge computing devices to jointly learn a
model without data sharing. As a leading algorithm in this setting, Federated Averaging …

Gradient sparsification for communication-efficient distributed optimization

J Wangni, J Wang, J Liu… - Advances in Neural …, 2018 - proceedings.neurips.cc
Modern large-scale machine learning applications require stochastic optimization
algorithms to be implemented on distributed computational architectures. A key bottleneck is …

A review of distributed statistical inference

Y Gao, W Liu, H Wang, X Wang, Y Yan… - Statistical Theory and …, 2022 - Taylor & Francis
The rapid emergence of massive datasets in various fields poses a serious challenge to
traditional statistical methods. Meanwhile, it provides opportunities for researchers to …

[BOOK] Statistical foundations of data science

J Fan, R Li, CH Zhang, H Zou - 2020 - taylorfrancis.com
Statistical Foundations of Data Science gives a thorough introduction to commonly used
statistical models, contemporary statistical machine learning techniques and algorithms …

Distributed testing and estimation under sparse high dimensional models

H Battey, J Fan, H Liu, J Lu, Z Zhu - Annals of statistics, 2018 - ncbi.nlm.nih.gov
This paper studies hypothesis testing and parameter estimation in the context of the divide-
and-conquer algorithm. In a unified likelihood based framework, we propose new test …

Distributed Computing and Inference for Big Data

L Zhou, Z Gong, P Xiang - Annual Review of Statistics and Its …, 2023 - annualreviews.org
Data are distributed across different sites due to computing facility limitations or data privacy
considerations. Conventional centralized methods—those in which all datasets are stored …

Quantile regression under memory constraint

X Chen, W Liu, Y Zhang - 2019 - projecteuclid.org
The Annals of Statistics 2019, Vol. 47, No. 6, 3244–3273.
https://doi.org/10.1214/18-AOS1777 © Institute of Mathematical Statistics …

Distributed estimation of principal eigenspaces

J Fan, D Wang, K Wang, Z Zhu - Annals of statistics, 2019 - ncbi.nlm.nih.gov
Principal component analysis (PCA) is fundamental to statistical machine learning. It extracts
latent principal factors that contribute to the most variation of the data. When data are stored …

Inference for multiple heterogeneous networks with a common invariant subspace

J Arroyo, A Athreya, J Cape, G Chen, CE Priebe… - Journal of Machine …, 2021 - jmlr.org
The development of models and methodology for the analysis of data from multiple
heterogeneous networks is of importance both in statistical network theory and across a …

Communication-efficient accurate statistical estimation

J Fan, Y Guo, K Wang - Journal of the American Statistical …, 2023 - Taylor & Francis
When the data are stored in a distributed manner, direct applications of traditional statistical
inference procedures are often prohibitive due to communication costs and privacy …