Anonymization or de-identification techniques are methods for protecting the privacy of subjects in sensitive data sets while preserving the utility of those data sets. The efficacy of these methods has come under repeated attacks as the ability to analyze large data sets becomes easier. Several researchers have shown that anonymized data can be reidentified to reveal the identity of the data subjects via approaches such as so-called "linking." In this report, we survey the anonymization landscape of approaches for addressing re-identification and we identify the challenges that still must be addressed to ensure the minimization of privacy violations. We also review several regulatory policies for disclosure of private data and tools to execute these policies.
This research was conducted by RAND Justice, Infrastructure, and Environment.
This report is part of the RAND Corporation working paper series. RAND working papers are intended to share researchers' latest findings and to solicit informal peer review. They have been approved for circulation by RAND but may not have been formally edited or peer reviewed.
Permission is given to duplicate this electronic document for personal use only, as long as it is unaltered and complete. Copies may not be duplicated for commercial purposes. Unauthorized posting of RAND PDFs to a non-RAND Web site is prohibited. RAND PDFs are protected under copyright law. For information on reprint and linking permissions, please visit the RAND Permissions page.
The RAND Corporation is a nonprofit institution that helps improve policy and decisionmaking through research and analysis. RAND's publications do not necessarily reflect the opinions of its research clients and sponsors.