Collaborative Targeted Learning Using Regression Shrinkage
Published in: Statistics in Medicine [Epub September 2017]. doi: 10.1002/sim.7527
Posted on RAND.org on December 15, 2017
Causal inference practitioners are routinely presented with the challenge of model selection and, in particular, reducing the size of the covariate set with the goal of improving estimation efficiency. Collaborative targeted minimum loss-based estimation (CTMLE) is a general framework for constructing doubly robust semiparametric causal estimators that data-adaptively limit model complexity in the propensity score to optimize a preferred loss function. This stepwise complexity reduction is based on a loss function placed on a strategically updated model for the outcome variable through which the error is assessed using cross-validation. We demonstrate how the existing stepwise variable selection CTMLE can be generalized using regression shrinkage of the propensity score. We present 2 new algorithms that involve stepwise selection of the penalization parameter(s) in the regression shrinkage. Simulation studies demonstrate that, under a misspecified outcome model, mean squared error and bias can be reduced by a CTMLE procedure that separately penalizes individual covariates in the propensity score. We demonstrate these approaches in an example using electronic medical data with sparse indicator covariates to evaluate the relative safety of 2 similarly indicated asthma therapies for pregnant women with moderate asthma.
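To make the idea of choosing a propensity score penalty by the loss of a targeted outcome model concrete, the following is a minimal sketch and not the authors' algorithm. It makes several simplifying assumptions: a ridge (L2) penalty stands in for the shrinkage method, the outcome is continuous with squared-error loss and a linear fluctuation, the initial outcome model is deliberately misspecified as arm-specific means, and a single penalty is chosen from a fixed grid rather than built up stepwise. All function names, the clipping bounds, and the simulated data are hypothetical.

```python
import numpy as np

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

def fit_ridge_logistic(X, a, lam, iters=50):
    """Newton's method for L2-penalized logistic regression (intercept unpenalized)."""
    n, p = X.shape
    Xd = np.column_stack([np.ones(n), X])
    pen = lam * np.eye(p + 1)
    pen[0, 0] = 0.0
    beta = np.zeros(p + 1)
    for _ in range(iters):
        mu = expit(Xd @ beta)
        grad = Xd.T @ (a - mu) - pen @ beta
        hess = (Xd * (mu * (1 - mu))[:, None]).T @ Xd + pen + 1e-8 * np.eye(p + 1)
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def predict_ps(beta, X):
    # Bound the propensity score away from 0 and 1 (bounds are an assumption).
    return np.clip(expit(beta[0] + X @ beta[1:]), 0.025, 0.975)

def clever_covariate(a, g):
    return a / g - (1 - a) / (1 - g)

def fluctuation(y, a, q, g):
    """Least-squares fluctuation parameter for the update q + eps * h."""
    h = clever_covariate(a, g)
    return np.sum(h * (y - q)) / np.sum(h ** 2)

def arm_means(y, a):
    # Deliberately misspecified initial outcome model: ignores covariates.
    return y[a == 1].mean(), y[a == 0].mean()

def cv_loss(X, a, y, lam, k=5, seed=0):
    """Cross-validated squared error of the targeted outcome model for one penalty."""
    folds = np.random.default_rng(seed).permutation(len(y)) % k
    total = 0.0
    for f in range(k):
        tr, va = folds != f, folds == f
        m1, m0 = arm_means(y[tr], a[tr])
        beta = fit_ridge_logistic(X[tr], a[tr], lam)
        q_tr = np.where(a[tr] == 1, m1, m0)
        eps = fluctuation(y[tr], a[tr], q_tr, predict_ps(beta, X[tr]))
        h_va = clever_covariate(a[va], predict_ps(beta, X[va]))
        q_va = np.where(a[va] == 1, m1, m0) + eps * h_va
        total += np.sum((y[va] - q_va) ** 2)
    return total / len(y)

def select_penalty_and_estimate(X, a, y, lambdas):
    """Pick the penalty with the smallest CV loss, then form the targeted ATE estimate."""
    losses = [cv_loss(X, a, y, lam) for lam in lambdas]
    best = lambdas[int(np.argmin(losses))]
    g = predict_ps(fit_ridge_logistic(X, a, best), X)
    m1, m0 = arm_means(y, a)
    eps = fluctuation(y, a, np.where(a == 1, m1, m0), g)
    q1 = m1 + eps / g          # targeted prediction under treatment
    q0 = m0 - eps / (1 - g)    # targeted prediction under control
    return best, float(np.mean(q1 - q0))

# Simulated data (hypothetical): one true confounder, four noise covariates.
rng = np.random.default_rng(1)
n = 2000
X = rng.normal(size=(n, 5))
a = rng.binomial(1, expit(0.8 * X[:, 0])).astype(float)
y = 1.0 * a + 2.0 * X[:, 0] + rng.normal(size=n)

lambdas = [0.0, 1.0, 10.0, 100.0, 1000.0]
best_lam, ate = select_penalty_and_estimate(X, a, y, lambdas)
print(best_lam, ate)
```

The point of the sketch is the selection criterion: the penalty is judged by how well the updated outcome model predicts held-out outcomes, not by how well the propensity score predicts treatment. The procedures summarized above go further by selecting the penalization parameter(s) stepwise and, in one variant, penalizing each covariate separately.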