Operationally Relevant Artificial Training for Machine Learning

Improving the Performance of Automated Target Recognition Systems

Published Nov 18, 2020

by Gavin S. Hartnett, Lance Menthe, Jasmin Léveillé, Damien Baveye, Li Ang Zhang, Dara Gold, Jeff Hagen, Jia Xu



Purchase Print Copy

Paperback, 43 pages, $19.00

Research Questions

  1. Can an object-detection model trained on artificial images evaluate real images?
  2. Were there differences in the performance of models trained on purely real images, purely artificial images, and hybrids (those consisting of a combination of artificial and real images)?

Automated target recognition (ATR) is one of the most important potential military applications of the many recent advances in artificial intelligence and machine learning. A key obstacle to creating a successful ATR system with machine learning is the collection of high-quality labeled data sets. The authors investigated whether this obstacle could be sidestepped by training object-detection algorithms on data sets made up of high-resolution, realistic artificial images.

The authors generated large quantities of artificial images of a high-mobility multipurpose wheeled vehicle (HMMWV) and investigated whether models trained on these images could then be used to successfully identify HMMWVs in real images. They obtained a clear negative result: Models trained on the artificial images performed very poorly on real images. However, they found that using the artificial images to supplement an existing data set of real images consistently resulted in a performance boost. Interestingly, the improvement was greatest when only a small number of real images was available. The authors suggest a novel method for boosting the performance of ATR systems in contexts where training data are scarce.

Many organizations, including the U.S. government and military, are now interested in using synthetic or simulated data to improve machine learning models for a wide variety of tasks. One of the main motivations is that, in times of conflict, there may be a need to quickly create labeled data sets of adversaries' military assets in previously unencountered environments or contexts.

Key Findings

  • Although the authors found that artificial images cannot replace real images, artificial images can supplement an existing data set of real images to boost performance.
  • Models trained on artificial images performed very poorly on real images.
  • Hybrid training sets—those consisting of a combination of artificial and real images—produced better performance than algorithms trained on real images alone.
  • The improvements were most noticeable when the number of real images was severely limited.
  • By boosting a data set of five real images with ten artificial ones, the authors were able to improve the precision and recall of the model by 54 percent and 29 percent, respectively.
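
The hybrid-set idea and the two metrics named in the findings can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `LabeledImage` type, the file names, the pool sizes, and the helper functions are all hypothetical, and this page does not say whether the reported 54 percent and 29 percent gains are relative changes or percentage-point differences. The `relative_change` helper shows only one possible reading.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledImage:
    path: str    # hypothetical file name, for illustration only
    source: str  # "real" or "artificial"


def build_hybrid_set(real, artificial, n_real, n_artificial, seed=0):
    """Draw n_real real and n_artificial artificial images, then shuffle them together."""
    rng = random.Random(seed)
    hybrid = rng.sample(real, n_real) + rng.sample(artificial, n_artificial)
    rng.shuffle(hybrid)
    return hybrid


def precision(true_pos, false_pos):
    """Fraction of predicted detections that are correct."""
    total = true_pos + false_pos
    return true_pos / total if total else 0.0


def recall(true_pos, false_neg):
    """Fraction of actual targets that were detected."""
    total = true_pos + false_neg
    return true_pos / total if total else 0.0


def relative_change(before, after):
    """Relative improvement of a metric (assumed interpretation, not confirmed by the source)."""
    return (after - before) / before


# Placeholder pools; sizes are arbitrary and not taken from the report.
real_pool = [LabeledImage(f"real_{i}.png", "real") for i in range(20)]
artificial_pool = [LabeledImage(f"artificial_{i}.png", "artificial") for i in range(100)]

# Five real images supplemented with ten artificial ones, as in the finding above.
hybrid = build_hybrid_set(real_pool, artificial_pool, n_real=5, n_artificial=10)
```

A model would then be trained on `hybrid` and evaluated on held-out real images only, so that `precision` and `recall` measure performance on the kind of data the system would encounter in practice.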


Recommendations

  • More research would be needed to determine under what conditions, if any, models trained successfully on artificial images might perform well on real images.
  • More work is also needed to better understand the ability of models trained on hybrid data sets to perform well on purely real images.

Funding for this study was made possible by the independent research and development provisions of RAND's contracts for the operation of its U.S. Department of Defense federally funded research and development centers. The research was conducted within RAND Project AIR FORCE.

This report is part of the RAND research report series. RAND reports present research findings and objective analysis that address the challenges facing the public and private sectors. All RAND reports undergo rigorous peer review to ensure high standards for research quality and objectivity.

This document and trademark(s) contained herein are protected by law. This representation of RAND intellectual property is provided for noncommercial use only. Unauthorized posting of this publication online is prohibited; linking directly to this product page is encouraged. Permission is required from RAND to reproduce, or reuse in another form, any of its research documents for commercial purposes. For information on reprint and reuse permissions, please visit

RAND is a nonprofit institution that helps improve policy and decisionmaking through research and analysis. RAND's publications do not necessarily reflect the opinions of its research clients and sponsors.