Implications of Missingness in Self-Reported Data for Estimating Racial and Ethnic Disparities in Medicaid Quality Measures
Published in: Health Services Research, Volume 57, Issue 6, pages 1370–1378 (December 2022). doi: 10.1111/1475-6773.14025
Posted on RAND.org on November 10, 2022
Objective
To assess the feasibility and implications of imputing race and ethnicity for quality and utilization measurement in Medicaid.
Data Sources and Study Setting
We used 2017 Oregon Medicaid claims from the Oregon Health Authority and electronic health records (EHR) from OCHIN, a clinical data research network.
Study Design
We cross-sectionally assessed Hispanic-White, Black-White, and Asian-White disparities in 22 quality and utilization measures, comparing self-reported race and ethnicity to imputed values from the Bayesian Improved Surname Geocoding (BISG) algorithm.
Data Collection/Extraction Methods
Race and ethnicity were obtained from self-reported data and imputed using BISG.
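As a rough illustration of the imputation step, BISG combines surname-based and geography-based probabilities with Bayes' rule: the posterior for each group is proportional to P(group | surname) × P(geography | group). The sketch below is a minimal Python version under stated assumptions. The function name `bisg_posterior`, the restriction to the four groups compared in this study, and every probability value are hypothetical; none of them come from the study or from Census tables.

```python
# Minimal sketch of the BISG Bayesian update (illustrative only).
# Assumption: only the four groups compared in this study are modeled.
RACES = ("asian", "black", "hispanic", "white")


def bisg_posterior(p_race_given_surname, p_geo_given_race):
    """Combine surname and geography evidence with Bayes' rule.

    p_race_given_surname: P(group | surname), e.g. from a surname list.
    p_geo_given_race:     P(living in this area | group).
    Returns P(group | surname, area), normalized to sum to 1.
    """
    unnormalized = {
        r: p_race_given_surname[r] * p_geo_given_race[r] for r in RACES
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("no group has nonzero probability")
    return {r: v / total for r, v in unnormalized.items()}


# Hypothetical inputs for one enrollee (made-up numbers).
surname_prior = {"asian": 0.02, "black": 0.08, "hispanic": 0.80, "white": 0.10}
geo_likelihood = {"asian": 0.001, "black": 0.002, "hispanic": 0.004, "white": 0.001}

posterior = bisg_posterior(surname_prior, geo_likelihood)
most_likely = max(posterior, key=posterior.get)
```

Because the output is a probability for each group rather than a single label, it can be produced for nearly every enrollee with a usable surname and address, which is consistent with the high availability of BISG estimates reported here.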
Principal Findings
Self-reported race and ethnicity were missing for 42.5% of claims and 4.9% of EHR data; BISG estimates were available for >99% of each and had good concordance (0.87–0.95) with Asian, Black, Hispanic, and White self-report. All estimated racial and ethnic disparities were statistically similar in self-reported and imputed EHR-based measures. However, within claims, BISG estimates and incomplete self-reported data yielded substantially different disparities in almost half of the measures, with BISG-based Black-White disparities generally larger than those estimated from self-reported race and ethnicity data.
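To make the mechanism concrete, the following Python sketch shows one way a disparity estimate can shift when records with missing self-report are dropped versus when every record contributes through its BISG probabilities. It is not the authors' estimator: the probability-weighted rate, the helper names `weighted_rate` and `disparity`, and the seven toy records are all assumptions chosen only for illustration.

```python
# Toy comparison of a Black-White gap in one quality measure, estimated from
# (a) self-report with missing records dropped and (b) BISG probabilities.
# All records below are made up; they are not data from the study.

def weighted_rate(outcomes, weights):
    """Weighted share of enrollees meeting the measure."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights sum to zero")
    return sum(o * w for o, w in zip(outcomes, weights)) / total


def disparity(outcomes, group_weights, reference_weights):
    """Group rate minus reference-group rate."""
    return weighted_rate(outcomes, group_weights) - weighted_rate(
        outcomes, reference_weights
    )


# (met_measure, self_report or None if missing, P(Black), P(White))
records = [
    (1, "white", 0.05, 0.90),
    (1, "white", 0.10, 0.85),
    (0, "white", 0.10, 0.80),
    (1, "black", 0.85, 0.10),
    (0, "black", 0.80, 0.10),
    (0, None, 0.90, 0.05),
    (0, None, 0.85, 0.10),
]
outcomes = [r[0] for r in records]

# (a) Self-report: 0/1 indicators, so missing records get zero weight.
self_black = [1.0 if r[1] == "black" else 0.0 for r in records]
self_white = [1.0 if r[1] == "white" else 0.0 for r in records]
self_gap = disparity(outcomes, self_black, self_white)

# (b) BISG: every record contributes in proportion to its probability.
bisg_black = [r[2] for r in records]
bisg_white = [r[3] for r in records]
bisg_gap = disparity(outcomes, bisg_black, bisg_white)
```

In this toy data the two records without self-report are likely Black and did not meet the measure, so dropping them understates the gap. Whether missingness operates this way in real claims data is an empirical question that the study addresses; the sketch only shows that it can.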
Conclusions
BISG imputation is feasible for Medicaid claims data and reduced missingness to <1%. Disparities may be larger than those estimated using self-reported data with high rates of missingness.