Use of Geocoding and Surname Analysis to Estimate Race and Ethnicity
Published in: Health Services Research, v. 41, no. 4, pt. 1, Aug. 2006, p. 1482-1500
Posted on RAND.org on December 31, 2005
OBJECTIVE: To review two indirect methods, geocoding and surname analysis, for estimating race/ethnicity as a means for health plans to assess disparities in care.
STUDY DESIGN: Review of published articles and unpublished data on the use of geocoding and surname analysis.
PRINCIPAL FINDINGS: Few published studies have evaluated the use of geocoding to estimate the racial and ethnic characteristics of a patient population or to assess disparities in health care. Three of four studies showed similar estimates of the proportion of blacks, and one showed nearly identical estimates of racial disparities, regardless of whether indirect or more direct measures (e.g., death certificate or CMS data) were used. However, accuracy depended on racial segregation levels in the population and region assessed, and geocoding was unreliable for identifying Hispanics and Asians/Pacific Islanders. Similarly, several studies suggest that surname analysis produces reasonable estimates of whether an enrollee is Hispanic or Asian/Pacific Islander and can identify disparities in care. However, its accuracy depends on the concentrations of Asians or Hispanics in the areas assessed. It is less accurate for women and for more acculturated and higher socioeconomic status (SES) persons because of intermarriage, name changes, and adoption. Surname analysis is not accurate for identifying African Americans. Recent unpublished analyses suggest that plans can successfully use a combined geocoding/surname analysis approach to identify disparities in care in most regions. Refinements based on Bayesian methods may make geocoding/surname analysis appropriate for use in areas where accuracy is currently poor, but validation of these preliminary results is needed.
CONCLUSIONS: Geocoding and surname analysis show promise for estimating the racial/ethnic composition of health plan enrollees when direct data on major racial and ethnic groups are lacking. These estimates can be used to assess disparities in care, pending availability of self-reported race/ethnicity data.