Multiple Edit/Multiple Imputation for Multivariate Continuous Data
Published in: Journal of the American Statistical Association, v. 98, no. 464, Dec. 2003, p. 807-817
Posted on RAND.org on December 31, 2002
Multiple imputation replaces an incomplete dataset with m > 1 simulated complete versions that are analyzed separately by standard methods. We present a natural extension of multiple imputation for handling the dual problems of nonresponse and response error. This extension, which we call multiple edit/multiple imputation (MEMI), replaces an observed dataset containing missing values and errors with m > 1 simulated versions of the ideal dataset that is complete and error-free. These ideal datasets are analyzed separately, and the results are combined using the same rules as for multiple imputation. The resulting inferences simultaneously reflect uncertainty due to nonresponse and response error. MEMI may be an attractive alternative to deterministic or quasi-statistical edit and imputation procedures used by many data-collecting agencies. Producing MEMIs requires assumptions about the distribution of the ideal data and the nature of nonresponse, together with a model for the response error mechanism. However, fitting such a model does not necessarily require data from a follow-up study. In this article we develop and implement MEMI for preliminary data from the Third National Health and Nutrition Examination Survey, Phase I (1988-1991). Raw body measurements for 1,345 children age 2-3 years are imputed under a Bayesian model for intermittent or semicontinuous errors. The resulting population estimates are found to be quite insensitive to prior assumptions about the rates and magnitude of errors.
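
The combining step mentioned in the abstract can be illustrated for a single scalar estimand. The sketch below assumes the standard multiple-imputation combining rules usually attributed to Rubin: the pooled estimate is the average of the m point estimates, and the total variance is the average within-version variance plus (1 + 1/m) times the between-version variance. The function name `pool_rubin` and the input numbers are hypothetical and are not taken from the article or from the survey data; this is not the authors' implementation.

```python
import math


def pool_rubin(estimates, variances):
    """Pool m point estimates and their variances (scalar estimand).

    Hypothetical helper for illustration only. Returns the pooled
    estimate, the total variance, and the approximate degrees of freedom.
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need m > 1 estimates, each with a variance")

    # Pooled point estimate: mean of the m per-version estimates.
    q_bar = sum(estimates) / m
    # Within-version variance: mean of the m estimated variances.
    u_bar = sum(variances) / m
    # Between-version variance: sample variance of the m estimates.
    b = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    # Total variance, inflated for the finite number of versions.
    t = u_bar + (1.0 + 1.0 / m) * b

    if b == 0.0:
        # No between-version variation: the reference t distribution
        # degenerates to a normal distribution.
        df = math.inf
    else:
        # Relative increase in variance due to missingness and error.
        r = (1.0 + 1.0 / m) * b / u_bar
        df = (m - 1) * (1.0 + 1.0 / r) ** 2
    return q_bar, t, df


# Made-up estimates and variances from m = 3 simulated ideal datasets.
q_bar, t, df = pool_rubin([10.0, 12.0, 11.0], [1.0, 1.0, 1.0])
print(q_bar, t, df)
```

Under MEMI the same arithmetic would apply, with each of the m inputs coming from an analysis of one simulated ideal dataset, so the between-version term carries the uncertainty due to both nonresponse and response error.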