Valid inference for machine learning-assisted genome-wide association studies

J Miao, Y Wu, Z Sun, X Miao, T Lu, J Zhao, Q Lu - Nature Genetics, 2024 - nature.com
Machine learning (ML) has become increasingly popular in almost all scientific
disciplines, including human genetics. Owing to challenges related to sample collection and …

Active statistical inference

T Zrnic, EJ Candès - arXiv preprint arXiv:2403.03208, 2024 - arxiv.org
Inspired by the concept of active learning, we propose active inference –
a methodology for statistical inference with machine-learning-assisted data collection …

From narratives to numbers: Valid inference using language model predictions from verbal autopsy narratives

S Fan, A Visokay, K Hoffman, S Salerno, L Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
In settings where most deaths occur outside the healthcare system, verbal autopsies (VAs)
are a common tool to monitor trends in causes of death (COD). VAs are interviews with a …

Another look at inference after prediction

J Gronsbell, J Gao, Y Shi, ZR McCaw… - arXiv preprint arXiv …, 2024 - arxiv.org
Prediction-based (PB) inference is increasingly used in applications where the outcome of
interest is difficult to obtain, but its predictors are readily available. Unlike traditional …

Predictions as surrogates: Revisiting surrogate outcomes in the age of AI

W Ji, L Lei, T Zrnic - arXiv preprint arXiv:2501.09731, 2025 - arxiv.org
We establish a formal connection between the decades-old surrogate outcome model in
biostatistics and economics and the emerging field of prediction-powered inference (PPI) …

Do We Really Even Need Data?

K Hoffman, S Salerno, A Afiaz, JT Leek… - arXiv preprint arXiv …, 2024 - arxiv.org
As artificial intelligence and machine learning tools become more accessible, and scientists
face new obstacles to data collection (e.g., rising costs, declining survey response rates) …

On the Role of Surrogates in Conformal Inference of Individual Causal Effects

C Gao, PB Gilbert, L Han - arXiv preprint arXiv:2412.12365, 2024 - arxiv.org
Learning the Individual Treatment Effect (ITE) is essential for personalized decision making,
yet causal inference has traditionally focused on aggregated treatment effects. While …

Prediction de‐correlated inference: A safe approach for post‐prediction inference

F Gan, W Liang, C Zou - Australian & New Zealand Journal of …, 2024 - Wiley Online Library
In modern data analysis, it is common to use machine learning methods to predict outcomes
on unlabelled datasets and then use these pseudo‐outcomes in subsequent statistical …

Prediction-Powered Inference with Imputed Covariates and Nonuniform Sampling

DM Kluger, K Lu, T Zrnic, S Wang, S Bates - arXiv preprint arXiv …, 2025 - arxiv.org
Machine learning models are increasingly used to produce predictions that serve as input
data in subsequent statistical analyses. For example, computer vision predictions of …

ipd: An R Package for Conducting Inference on Predicted Data

S Salerno, J Miao, A Afiaz, K Hoffman, A Neufeld… - …, 2025 - academic.oup.com
ipd is an open-source R software package for the downstream modeling
of an outcome and its associated features where a potentially sizable portion of the outcome …