Data-splitting improves statistical performance in overparameterized regimes
N Mücke, E Reiss, J Rungenhagen… - International …, 2022 - proceedings.mlr.press
While large training datasets generally offer improvement in model performance, the training
process becomes computationally expensive and time consuming. Distributed learning is a …
Distributed learning of conditional quantiles in the reproducing kernel Hilbert space
H Lian - Advances in Neural Information Processing …, 2022 - proceedings.neurips.cc
We study distributed learning of nonparametric conditional quantiles with Tikhonov
regularization in a reproducing kernel Hilbert space (RKHS). Although distributed parametric …
Smoothing splines approximation using Hilbert curve basis selection
Smoothing splines have been used pervasively in nonparametric regressions. However, the
computational burden of smoothing splines is significant when the sample size n is large …
Unbalanced distributed estimation and inference for the precision matrix in Gaussian graphical models
This paper studies the estimation of Gaussian graphical models in the unbalanced
distributed framework. It provides an effective approach when the available machines are of …
Least squares model averaging for distributed data
H Zhang, Z Liu, G Zou - Journal of Machine Learning Research, 2023 - jmlr.org
Divide and conquer algorithm is a common strategy applied in big data. Model averaging
has the natural divide-and-conquer feature, but its theory has not been developed in big …
A study of the impact of COVID‐19 on the Chinese stock market based on a new textual multiple ARMA model
W Xu, Z Fu, H Li, J Huang, W Xu… - Statistical Analysis and …, 2023 - Wiley Online Library
Coronavirus 2019 (COVID‐19) has caused violent fluctuation in stock markets, and
led to heated discussion in stock forums. The rise and fall of any specific stock is influenced …
Nonparametric Additive Models for Billion Observations
The nonparametric additive model (NAM) is a widely used nonparametric regression
method. Nevertheless, due to the high computational burden, classic statistical techniques …
A Simple Divide-and-Conquer-based Distributed Method for the Accelerated Failure Time Model
L Chen, J Su, ATK Wan, Y Zhou - Journal of Computational and …, 2024 - Taylor & Francis
The accelerated failure time (AFT) model is an appealing tool in survival analysis because of
its ease of interpretation, but when there is a large volume of data, fitting an AFT model and …
Semiparametric estimation for the functional additive hazards model
M Hao, K Liu, W Su, X Zhao - Canadian Journal of Statistics, 2024 - Wiley Online Library
We propose a new functional additive hazards model to investigate the potential effects of
functional and scalar predictors on mortality risks, and develop a penalized least squares …
Distributed estimation with empirical likelihood
Q Liu, Z Li - Canadian Journal of Statistics, 2023 - Wiley Online Library
With the development of science and technology, massive datasets stored in multiple
machines are increasingly prevalent. It is known that traditional statistical methods may be …