Anonymization, or de-identification, techniques are methods for protecting the privacy of subjects in sensitive data sets while preserving the utility of those data sets. The efficacy of these methods has come under repeated attack as analyzing large data sets has become easier. Several researchers have shown that anonymized data can be re-identified to reveal the identity of the data subjects via approaches such as so-called "linking." In this report, we survey the landscape of anonymization approaches for addressing re-identification, and we identify the challenges that must still be addressed to minimize privacy violations. We also review several regulatory policies for the disclosure of private data and tools for executing these policies.
Davis, John S. II and Osonde A. Osoba, Privacy Preservation in the Age of Big Data: A Survey. Santa Monica, CA: RAND Corporation, 2016. https://www.rand.org/pubs/working_papers/WR1161.html.