The Urge to Merge

Linking Vital Statistics Records and Medicaid Claims

Robert M. Bell, Joan Keesey, Toni Richards

ResearchPosted on rand.org 1994Published in: Medical Care, v. 32, no. 10, Oct. 1994, p. 1004-1018

This paper describes a procedure used to link Medicaid claims data to California vital statistics records for very low birthweight infants. The linkage involved about 53,000 infants born from 1980 to 1987 and 1.46 million claims for delivery/birth-related hospital admissions during the same period. Because the two data files did not share a unique identifier, record linkage required combining evidence across several linking variables: delivery hospital, delivery/birth date or hospitalization period, names, mother's age, and zip code. To combine the various pieces of evidence, we used record linkage theory to compute scores that measure the likelihood of a match, i.e., that two records correspond to the same delivery. These scores appropriately weight the various pieces of evidence for or against a match. Implementation required dealing with large amounts of missing data in one of the files, errors and variations in reported names, and the need to minimize the number of incorrect links. The approach applies to a wide range of linkage problems. The ability to combine existing datasets to form new datasets containing analysis variables from each facilitates analyses that would otherwise be impossible, or prohibitively expensive.

Topics

Document Details

  • Availability: Non-RAND
  • Year: 1994
  • Pages: 15
  • Document Number: EP-199400-27

This publication is part of the RAND external publication series. Many RAND studies are published in peer-reviewed scholarly journals, as chapters in commercial books, or as documents published by other organizations.

RAND is a nonprofit institution that helps improve policy and decisionmaking through research and analysis. RAND's publications do not necessarily reflect the opinions of its research clients and sponsors.