Propensity scores for repeated treatments: A tutorial for the iptw function in the twang package

This tutorial describes the use of the TWANG package in R to estimate inverse probability of treatment weights (IPTWs) when one has time varying treatments or sequences of treatments over time.


Introduction
While standard propensity score methods attempt to answer the question of how expected outcomes change if a group of individuals received one treatment instead of another, researchers are often interested in understanding how sequences of treatments impact outcomes of interest. In this case, time-varying confounders may be impacted by prior treatments. Consequently, simply controlling for the time-varying confounders in standard regression models can yield biased results. Instead, it is possible to perform weighted regressions that account for time-varying confounders via marginal structural models (MSMs; Robins et al., 2000). In this method, observations are weighted by the inverse of the estimated probability of receiving the observed sequence of treatments the individual actually received, referred to as an inverse probability of treatment weight (IPTW). It has been proposed to use nonparametric models to estimate IPTWs (Griffin et al., 2014). Accordingly, we refer to the function in twang that performs this weighting as iptw, for inverse probability of treatment weighting.
For binary treatments, the iptw methods and syntax build directly on the ps functionality; users are encouraged to study that tutorial before using iptw. For treatment regimes with more than two categories, the iptw methods build on the mnps methods and software. For more background on marginal structural models, see e.g., Robins et al. (2000) and Cole and Hernán (2008).

An IPTW example
For the sake of illustration, we simulated data to demonstrate the functionality of the iptw command. For time-varying treatment data, one can either imagine a "wide" dataset, with one row per subject, or a "long" dataset with one row for each subject/time point combination. Our artificial data include time-invariant characteristics gender, and age at time of study enrollment. Conceptually, we have a substance use index that is measured four times: at baseline, after the first treatment period, after the second treatment period, and after the third treatment period, which concludes the study and is the outcome of interest. In the "wide" version of the dataset called iptwExWide, we have the outcome, baseline and intermediate measures, use0, use1, and use2. The treatment indicators are, in chronological order, tx1, tx2, and tx3. Our goal is to estimate the average effect of each additional dose of treatment on substance use measured at the end of the study (which is recorded in outcome).
The "long" format data have a somewhat different form, and are included in the data object iptwExLong. For the long format, the outcomes are split from the covariates, and are available as iptwExLong$outcome. Similarly, the covariates and treatment indicators are available in covariates, which includes data elements gender, age, use, and tx; these include the same information as the wide version. Additionally, the long version contains elements ID (an individual-level identifier) and time, which corresponds to the study period.
One of the benefits of GBM for applied researchers is the automatic handling of missing data. For MSMs, however, this does not extend to partially observed treatment histories. We assume throughout that missingness exists only in the covariates.

Fitting the model
To begin, we will work with the "wide" version of the data, which are available after loading the twang package: > library(twang) > data(iptwExWide) Next, we can fit the model using the iptw function. Unlike for the standard ps function, we are only able to use a single stop.method at a time. The treatment assignment models are specified as a list of formulas, starting at the earliest time period. For coding parsimony, terms that should appear in all of the formulas can be specified once via a one-sided formula using the timeInvariant argument. Similarly, including treatment indicators from previous models is achieved by setting priorTreatment = TRUE. Finally, if all terms included at period t should be included in the period t + 1 model (as is typically the case in MSM models), setting cumulative = TRUE automatically includes all elements on the right-hand side of previous models. Thus, the model > iptw.Ex <-iptw(list(tx1~use0 + gender + age, + tx2~use1 + use0 + tx1 + gender + age, + tx3~use2 + use1 + use0 + tx2 + tx1 + gender + age), + timeInvariant~gender + age, + data = iptwExWide, + cumulative = FALSE, + priorTreatment = FALSE, + verbose = FALSE, + stop.method = "es.max", + n.trees = 5000) can be specified more simply as: After having fit the iptw object, the diagnostic checks are similar to those specified for ps objects.
First, we check to make sure that the GBM models were allowed to run long enough (i.e., n.trees is sufficiently large).
> plot(iptw.Ex, plots = 1) Next, we can check balance as measured by standardized mean differences between the treated and control samples at each of the time points by specifying plots = 3. As with other TWANG figures, we can specify color = FALSE to produce black and white figures. Finally, we can compare the differences between the treated and control samples using t-test and KS p-values by specifying plots = 4 and plots = 5, respectively. Further, iptw can accommodate treatments with more than two levels (McCaffrey et al., 2013). An example can be explored in the following call, though we do not discuss it further in this vignette. See the mnps vignette for more information on the diagnostic plots.

Estimating treatment effects
After having estimated the relevant propensity scores, the final step is translating them into analytic weights and estimating treatment effects. Twang provides several functions to facilitate this process. For this analysis, we assume an additive treatment model, where the mean change in outcomes depends on the number of periods of treatment. Because the weights often have substantial variation, the weights are commonly stabilized where the standard inverse probability of treatment weights are multiplied by the estimated probability of receiving the treatment that each individual received, conditioning only on previous periods' treatment indicators.

Conclusion
Frequently, researchers are interested in treatments that may vary period-by-period. Twang's iptw function provides a nonparametric method for calculating inverse probability of treatment weights for marginal structural models. The function can accommodate treatments with two or more levels. The diagnostic figures and tables build on of the mnps and ps commands, with additional features to help manage the numerous possible comparisons.