Generalizing Treatment Effects from Trials to EHR Populations using Propensity Score Predictive Inference

Oct 14, 2025

Jungang Zou

Joseph E. Schwartz

Nathalie Moise

Roderick Little

Qixuan Chen


Abstract

Although randomized controlled trials provide strong internal validity, they often lack external validity when their results are generalized to broader populations. This limitation, the problem of generalizability, arises when trial participants are not representative of the target population of interest. To address this challenge, we develop a novel interaction-based Propensity Score Predictive Inference (PSPI) framework that emphasizes the central role of propensity scores for trial participation, combined with flexible outcome models. We introduce three PSPI variants, including two robust estimators of average treatment effects and potential outcomes across treatment groups that incorporate a natural cubic spline of the propensity score and model high-dimensional covariates using Bayesian Additive Regression Trees. Our approach enhances both the efficiency and interpretability of generalizability analyses. Simulation studies show that PSPI models outperform existing methods, achieving lower mean squared error and near-nominal coverage rates, particularly in settings with treatment imbalance or covariate shift between trial participants and the target population. We further demonstrate the utility of our approaches by generalizing the treatment effect of a multi-level, multi-component depression intervention from a randomized trial to the full population of eligible patients identified through electronic health records.
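
The general idea of predicting outcomes for a target population from a model that includes the trial-participation propensity score can be illustrated with a minimal sketch. This is not the PSPI implementation. It assumes a simulated population with a single covariate, replaces the natural cubic spline with a quadratic polynomial in the logit propensity score, and uses ordinary least squares in place of Bayesian Additive Regression Trees. All variable names and the data-generating model are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(design, labels, n_iter=25):
    """Logistic regression by Newton-Raphson; `design` includes an intercept column."""
    beta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = expit(design @ beta)
        grad = design.T @ (labels - p)
        hess = (design * (p * (1.0 - p))[:, None]).T @ design
        beta = beta + np.linalg.solve(hess, grad)
    return beta


# Hypothetical simulated population (not data from the paper).
n = 20_000
x = rng.normal(size=n)                  # single baseline covariate
s = rng.binomial(1, expit(-1.0 + x))    # trial participation depends on x
a = rng.binomial(1, 0.5, size=n)        # randomized arm, used only where s == 1
# Outcome with a treatment effect that varies with x; used only where s == 1.
y = x + a * (1.0 + 0.5 * x) + rng.normal(scale=0.5, size=n)

# Step 1: propensity score for trial participation, on the logit scale.
design = np.column_stack([np.ones(n), x])
beta = fit_logistic(design, s)
logit_ps = design @ beta

# Step 2: arm-specific outcome models on a quadratic basis in the logit
# propensity score (an assumed stand-in for a spline plus BART).
basis = np.column_stack([np.ones(n), logit_ps, logit_ps**2])
in_trial = s == 1


def predict_arm(arm):
    """Fit on trial participants in `arm`, then predict for everyone."""
    rows = in_trial & (a == arm)
    coef, *_ = np.linalg.lstsq(basis[rows], y[rows], rcond=None)
    return basis @ coef


# Step 3: average predicted potential-outcome contrasts over the full population.
ate_target = float(np.mean(predict_arm(1) - predict_arm(0)))

# Comparison values: unadjusted trial estimate and the simulation truth.
ate_trial = float(y[in_trial & (a == 1)].mean() - y[in_trial & (a == 0)].mean())
ate_truth = float(1.0 + 0.5 * x.mean())

print(f"trial-only: {ate_trial:.3f}  generalized: {ate_target:.3f}  truth: {ate_truth:.3f}")
```

In this simulated setting, participation is more likely at higher covariate values, where the treatment effect is larger, so the unadjusted trial contrast should overstate the population effect while the prediction-based estimate should land closer to the simulated truth.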

Type

Preprint

Publication

Under preparation

Last updated on Oct 14, 2025

Bayesian Additive Regression Trees; Causal Inference; Clinical Trials; Electronic Health Records; Generalizability; Propensity Score.

Authors

Jungang Zou (he/him)

Ph.D. student
