Machine learning for advanced energy materials

Y Liu, OC Esan, Z Pan, L An - Energy and AI, 2021 - Elsevier
The screening of advanced materials coupled with the modeling of their quantitative
structural-activity relationships has recently become one of the hot and trending topics in …

A review on design inspired subsampling for big data

J Yu, M Ai, Z Ye - Statistical Papers, 2024 - Springer
Subsampling focuses on selecting a subsample that can efficiently sketch the information of
the original data in terms of statistical inference. It provides a powerful tool in big data …

Information-based optimal subdata selection for big data linear regression

HY Wang, M Yang, J Stufken - Journal of the American Statistical …, 2019 - Taylor & Francis
Extraordinary amounts of data are being produced in many branches of science. Proven
statistical methods are no longer applicable with extraordinary large datasets due to …

Granular ball sampling for noisy label classification or imbalanced classification

S Xia, S Zheng, G Wang, X Gao… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
This article presents a general sampling method, called granular-ball sampling (GBS), for
classification problems by introducing the idea of granular computing. The GBS method …

Communication-efficient surrogate quantile regression for non-randomly distributed system

K Wang, B Zhang, F Alenezi, S Li - Information Sciences, 2022 - Elsevier
Distributed system has been widely used to solve massive data analysis tasks. This article
targets on quantile regression on distributed system with non-randomly distributed massive …

Optimal subsampling for quantile regression in big data

H Wang, Y Ma - Biometrika, 2021 - academic.oup.com
We investigate optimal subsampling for quantile regression. We derive the asymptotic
distribution of a general subsampling estimator and then derive two versions of optimal …

Workshop report on basic research needs for scientific machine learning: Core technologies for artificial intelligence

N Baker, F Alexander, T Bremer, A Hagberg… - 2019 - osti.gov
Scientific Machine Learning (SciML) and Artificial Intelligence (AI) will have broad use and
transformative effects across the Department of Energy. Accordingly, the January 2018 Basic …

Optimal distributed subsampling for maximum quasi-likelihood estimators with massive data

J Yu, HY Wang, M Ai, H Zhang - Journal of the American Statistical …, 2022 - Taylor & Francis
Nonuniform subsampling methods are effective to reduce computational burden and
maintain estimation efficiency for massive data. Existing methods mostly focus on …

Optimal subsampling algorithms for big data regressions

M Ai, J Yu, H Zhang, HY Wang - Statistica Sinica, 2021 - JSTOR
In order to quickly approximate maximum likelihood estimators from massive data, this study
examines the optimal subsampling method under the A-optimality criterion (OSMAC) for …

More efficient estimation for logistic regression with optimal subsamples

HY Wang - Journal of Machine Learning Research, 2019 - jmlr.org
In this paper, we propose improved estimation method for logistic regression based on
subsamples taken according the optimal subsampling probabilities developed in Wang et …