Machine learning for advanced energy materials
The screening of advanced materials coupled with the modeling of their quantitative
structural-activity relationships has recently become one of the hot and trending topics in …
A review on design inspired subsampling for big data
J Yu, M Ai, Z Ye - Statistical Papers, 2024 - Springer
Subsampling focuses on selecting a subsample that can efficiently sketch the information of
the original data in terms of statistical inference. It provides a powerful tool in big data …
Information-based optimal subdata selection for big data linear regression
Extraordinary amounts of data are being produced in many branches of science. Proven
statistical methods are no longer applicable with extraordinary large datasets due to …
Granular ball sampling for noisy label classification or imbalanced classification
This article presents a general sampling method, called granular-ball sampling (GBS), for
classification problems by introducing the idea of granular computing. The GBS method …
Communication-efficient surrogate quantile regression for non-randomly distributed system
K Wang, B Zhang, F Alenezi, S Li - Information sciences, 2022 - Elsevier
Distributed system has been widely used to solve massive data analysis tasks. This article
targets on quantile regression on distributed system with non-randomly distributed massive …
Optimal subsampling for quantile regression in big data
H Wang, Y Ma - Biometrika, 2021 - academic.oup.com
We investigate optimal subsampling for quantile regression. We derive the asymptotic
distribution of a general subsampling estimator and then derive two versions of optimal …
Workshop report on basic research needs for scientific machine learning: Core technologies for artificial intelligence
Scientific Machine Learning (SciML) and Artificial Intelligence (AI) will have broad use and
transformative effects across the Department of Energy. Accordingly, the January 2018 Basic …
Optimal distributed subsampling for maximum quasi-likelihood estimators with massive data
Nonuniform subsampling methods are effective to reduce computational burden and
maintain estimation efficiency for massive data. Existing methods mostly focus on …
Optimal subsampling algorithms for big data regressions
In order to quickly approximate maximum likelihood estimators from massive data, this study
examines the optimal subsampling method under the A-optimality criterion (OSMAC) for …
More efficient estimation for logistic regression with optimal subsamples
HY Wang - Journal of machine learning research, 2019 - jmlr.org
In this paper, we propose improved estimation method for logistic regression based on
subsamples taken according the optimal subsampling probabilities developed in Wang et …