Using Artificial Intelligence to Generate Synthetic Health Data

Published Oct 3, 2023

by Federico Girosi, Sai Prathyush Katragadda, Joshua Steier, Raffaele Vardavas

Generating synthetic data makes it possible to share sensitive data sets with the research community. This report uses two off-the-shelf methods to generate synthetic health data. One method, synthpop, is based on standard statistical techniques. The other, CTGAN, is a deep learning generative adversarial network. The authors compare the performance of the two methods and discuss why synthpop outperforms CTGAN on these data.

Research conducted by

The research described in this report was conducted by RAND Health Care.

This report is part of the RAND working paper series. RAND working papers are intended to share researchers' latest findings and to solicit informal peer review. They have been approved for circulation by RAND but may not have been formally edited or peer reviewed.

This document and trademark(s) contained herein are protected by law. This representation of RAND intellectual property is provided for noncommercial use only. Unauthorized posting of this publication online is prohibited; linking directly to this product page is encouraged. Permission is required from RAND to reproduce, or reuse in another form, any of its research documents for commercial purposes. For information on reprint and reuse permissions, please visit

RAND is a nonprofit institution that helps improve policy and decisionmaking through research and analysis. RAND's publications do not necessarily reflect the opinions of its research clients and sponsors.