RAND Bayesian Improved Surname Geocoding

Advancing equity through data science

Data can be a powerful tool in helping organizations measure—and mitigate—disparities among racial and ethnic groups.

But a given dataset's potential to examine and reduce such disparities can be hindered by missing or incomplete racial, ethnic, and other socioeconomic or demographic information.

Bayesian Improved Surname Geocoding (BISG) is a methodology developed by the RAND Corporation that can help U.S. organizations produce accurate, cost-effective estimates of racial and ethnic disparities within datasets and identify areas for improvement.

How Does the RAND BISG Method Work?

When self-reported data about race and ethnicity are unavailable or limited, RAND's indirect estimation method can generate useful estimates of race and ethnicity.

The RAND approach efficiently combines two commonly used sources of information for estimating race and ethnicity, a person's surname and geocoded address, and refines Census data for this application.
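As a rough illustration of how surname and address information can be combined, the sketch below applies Bayes' rule: a surname-based probability for each group is treated as the prior and is updated with the share of each group that lives in the person's geographic area. This is a minimal sketch rather than RAND's implementation. It assumes that surname and location are independent given race and ethnicity, and the function name, group labels, and all numbers are hypothetical.

```python
def bisg_posterior(p_race_given_surname, p_geo_given_race):
    """Update a surname-based prior with a geography likelihood via Bayes' rule.

    p_race_given_surname: dict mapping group -> P(group | surname)
    p_geo_given_race:     dict mapping group -> P(living in this area | group)
    Returns a dict mapping group -> P(group | surname, area).
    """
    unnormalized = {
        group: p_race_given_surname[group] * p_geo_given_race[group]
        for group in p_race_given_surname
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("all groups have zero probability for this input")
    return {group: value / total for group, value in unnormalized.items()}


# Hypothetical inputs (NOT real Census figures): a surname that is mostly
# Hispanic nationally, and an area where Hispanic residents are overrepresented.
surname_prior = {"api": 0.05, "black": 0.10, "hispanic": 0.70, "white": 0.15}
geo_likelihood = {"api": 0.0002, "black": 0.0001, "hispanic": 0.0004, "white": 0.0001}

posterior = bisg_posterior(surname_prior, geo_likelihood)
for group, prob in posterior.items():
    print(f"{group}: {prob:.3f}")
```

The result is a set of probabilities that sum to 1 for each person, not a single assigned category.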


How Can BISG Be Used?

BISG has been used to assess disparities, support quality improvement, monitor the effects of interventions, and more.1

Other potential use cases include:

  • Estimating the racial and ethnic composition of a patient population
  • Comparing racial and ethnic differences in health care quality and outcomes
  • Coordinating community-level outreach and interventions
  • Comparing effectiveness of interventions
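The first two use cases above can be sketched with per-person probabilities. The example below averages each group's probability across people to estimate the composition of a population, and uses the probabilities as weights to estimate an outcome rate for one group. This is one simple way to work with probabilistic estimates, shown under stated assumptions, and is not necessarily the estimator RAND recommends. The function names, group labels, and data are hypothetical.

```python
def estimated_composition(posteriors):
    """Estimate each group's share by averaging per-person probabilities."""
    if not posteriors:
        raise ValueError("need at least one person")
    n = len(posteriors)
    groups = posteriors[0].keys()
    return {g: sum(p[g] for p in posteriors) / n for g in groups}


def weighted_rate(posteriors, outcomes, group):
    """Probability-weighted mean of a 0/1 outcome for one group."""
    weight_sum = sum(p[group] for p in posteriors)
    if weight_sum == 0:
        raise ValueError(f"no probability mass for group {group!r}")
    weighted_total = sum(p[group] * y for p, y in zip(posteriors, outcomes))
    return weighted_total / weight_sum


# Hypothetical per-patient probabilities and a hypothetical 0/1 quality measure.
patients = [
    {"api": 0.1, "black": 0.1, "hispanic": 0.7, "white": 0.1},
    {"api": 0.0, "black": 0.6, "hispanic": 0.1, "white": 0.3},
    {"api": 0.8, "black": 0.0, "hispanic": 0.1, "white": 0.1},
    {"api": 0.1, "black": 0.1, "hispanic": 0.1, "white": 0.7},
]
received_screening = [1, 0, 1, 1]

shares = estimated_composition(patients)
white_rate = weighted_rate(patients, received_screening, "white")
print(shares)
print(round(white_rate, 3))
```

Working with the probabilities directly, rather than assigning each person to a single most likely group, keeps the uncertainty of the estimates in the calculation.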

When Is It Appropriate to Use Indirect Estimation Methods Like BISG?

Incorporating patient perspectives in guidelines can help clinicians engage their patients in shared decisionmaking and jointly make decisions that are based on clinical and patient-generated evidence.

Patient perspectives add a valuable dimension to clinical guidelines because they can help identify factors that may affect guideline adherence.

Knowing what patients want, value, or find challenging about a given treatment can improve care quality and patient experiences.

How Accurate Is BISG?

The RAND indirect estimation method was found to be 41 percent more efficient than using surnames only, and 108 percent more efficient than using geocoding only.3

BISG estimates are strongly predictive of self-reported race and ethnicity for the four largest racial and ethnic groups in the U.S.2 Predictive accuracy is measured using the C-statistic, also called the concordance statistic.

The C-statistic ranges from 0.5 (no predictiveness) to 1.0 (perfect predictiveness). C-statistics for the BISG methodology are 0.94 for Asian/Pacific Islander, 0.93 for Black, 0.94 for Hispanic, and 0.93 for White.
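To make the C-statistic concrete, the sketch below computes it for one group as the share of (member, non-member) pairs in which the member received the higher predicted probability, counting ties as half. This is the standard pairwise definition, written as a minimal illustration. The function name and the data are hypothetical and are not drawn from RAND's validation studies.

```python
def c_statistic(probabilities, is_member):
    """Concordance statistic for one group.

    probabilities: predicted probability of group membership for each person
    is_member:     True if the person self-reports membership in the group
    """
    members = [p for p, m in zip(probabilities, is_member) if m]
    non_members = [p for p, m in zip(probabilities, is_member) if not m]
    if not members or not non_members:
        raise ValueError("need at least one member and one non-member")

    concordant = 0.0
    for p_member in members:
        for p_other in non_members:
            if p_member > p_other:
                concordant += 1.0   # pair ranked correctly
            elif p_member == p_other:
                concordant += 0.5   # tie counts as half
    return concordant / (len(members) * len(non_members))


# Hypothetical predicted probabilities and self-reported membership.
predicted = [0.92, 0.80, 0.35, 0.60, 0.10, 0.05]
self_reported = [True, True, True, False, False, False]

c = c_statistic(predicted, self_reported)
print(round(c, 3))
```

In this toy data, 8 of the 9 member and non-member pairs are ranked correctly, so the statistic is 8/9. The nested loop is quadratic in the number of people, which is fine for illustration but slow for large datasets.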

When used for making inferences about groups for which accuracy is this high, estimates of race and ethnicity allow for more accurate measurement of disparities than approaches that are limited to data with high rates of selective nonresponse or administrative error.

BISG can be customized for different datasets, which may further increase predictive accuracy.

Can I Use the BISG Method at My Organization?

Yes. RAND's BISG method is the most well-validated and widely used racial and ethnic data estimation method. BISG has been endorsed by the National Academy of Medicine and other entities.

If you are planning to use BISG for your project or report, please reach out to us via email at bisg@rand.org for guidance on appropriate citations, applying the methodology, or interpreting results.

For more information about how to use the method, contact us.

Medicare BISG

RAND also developed Medicare BISG (MBISG), a specialized version for use with Centers for Medicare & Medicaid Services data. MBISG uses additional Medicare data to improve administrative measures of the race and ethnicity of Medicare beneficiaries.

Explore more ways RAND is committed to mitigating racial and ethnic disparities through the RAND Center to Advance Racial Equity Policy.

Get Started with BISG

BISG is in the public domain and free to use, subject to these terms and conditions.

If you have questions or would like to discuss hiring RAND to assist you with a customized implementation of BISG for your organization, please email bisg@rand.org or complete the form.


1 Marc N. Elliott, "Imputation Methods for Increasing Racial/Ethnic Data Disaggregation," webinar, October 8, 2021. As of November 29, 2021: https://healthsurveynetwork.org/past-activities/

2 Marc N. Elliott, Peter A. Morrison, Allen Fremont, Daniel F. McCaffrey, Philip Pantoja, and Nicole Lurie, "Using the Census Bureau's Surname List to Improve Estimates of Race/Ethnicity and Associated Disparities," Health Services and Outcomes Research Methodology, Vol. 9, No. 2, June 2009, pp. 69–83.

3 Elliott et al., 2009.