Operationally Relevant Artificial Training for Machine Learning

Improving the Performance of Automated Target Recognition Systems

by Gavin S. Hartnett, Lance Menthe, Jasmin Leveille, Damien Baveye, Li Ang Zhang, Dara Gold, Jeff Hagen, Jia Xu

Research Questions

  1. Can an object-detection model trained on artificial images evaluate real images?
  2. Were there differences in the performance of models trained on purely real images, purely artificial images, and hybrids (those consisting of a combination of artificial and real images)?

Automated target recognition (ATR) is one of the most important potential military applications of the many recent advances in artificial intelligence and machine learning. A key obstacle to creating a successful ATR system with machine learning is the collection of high-quality labeled data sets. The authors investigated whether this obstacle could be sidestepped by training object-detection algorithms on data sets made up of high-resolution, realistic artificial images.

The authors generated large quantities of artificial images of a high-mobility multipurpose wheeled vehicle (HMMWV) and investigated whether models trained on these images could then successfully identify HMMWVs in real images. The authors obtained a clear negative result: Models trained on the artificial images performed very poorly on real images. However, they found that using the artificial images to supplement an existing data set of real images consistently resulted in a performance boost. Interestingly, the improvement was greatest when only a small number of real images was available. On this basis, the authors suggest a novel method for boosting the performance of ATR systems in contexts where training data are scarce.

Many organizations, including the U.S. government and military, are now interested in using synthetic or simulated data to improve machine learning models for a wide variety of tasks. One of the main motivations is that, in times of conflict, there may be a need to quickly create labeled data sets of adversaries' military assets in previously unencountered environments or contexts.
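The hybrid approach described above, in which a small set of real images is supplemented with artificial ones, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `build_hybrid_set`, the file names, and the sampling scheme are all assumptions made for illustration, and the report's actual data handling may differ.

```python
import random
from typing import List, Tuple


def build_hybrid_set(
    real: List[str],
    artificial: List[str],
    n_real: int,
    n_artificial: int,
    seed: int = 0,
) -> List[Tuple[str, str]]:
    """Return a shuffled list of (image_path, source) pairs.

    Hypothetical helper: draws n_real real images and n_artificial
    artificial images without replacement and mixes them together.
    """
    if n_real > len(real) or n_artificial > len(artificial):
        raise ValueError("requested more images than are available")
    rng = random.Random(seed)  # seeded so the draw is reproducible
    chosen = [(path, "real") for path in rng.sample(real, n_real)]
    chosen += [(path, "artificial") for path in rng.sample(artificial, n_artificial)]
    rng.shuffle(chosen)  # avoid ordering the training set by source
    return chosen


# Placeholder file names, not files from the study.
real_images = [f"real_{i}.png" for i in range(20)]
artificial_images = [f"artificial_{i}.png" for i in range(100)]
hybrid = build_hybrid_set(real_images, artificial_images, n_real=5, n_artificial=10)
```

Keeping the source label alongside each path makes it easy to hold out purely real images for evaluation, which is the setting the findings refer to.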

Key Findings

  • Although the authors found that artificial images cannot replace real images, artificial images can supplement an existing data set of real images to boost performance.
  • Models trained on artificial images performed very poorly on real images.
  • Models trained on hybrid data sets (those consisting of a combination of artificial and real images) performed better than models trained on real images alone.
  • The improvements were most noticeable when the number of real images was severely limited.
  • By boosting a data set of five real images with ten artificial ones, the authors were able to improve the precision and recall of the model by 54 percent and 29 percent, respectively.
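For readers unfamiliar with the two metrics named in the last finding, the sketch below shows how precision and recall are computed from detection counts and how an improvement between two models can be expressed. The counts are invented for illustration and are not results from the report. The summary does not say whether the 54 percent and 29 percent figures are relative changes or percentage-point differences; the `relative_change` helper shows one possible reading only.

```python
def precision(true_pos: int, false_pos: int) -> float:
    """Fraction of predicted detections that are correct."""
    total = true_pos + false_pos
    return true_pos / total if total else 0.0


def recall(true_pos: int, false_neg: int) -> float:
    """Fraction of actual targets that were detected."""
    total = true_pos + false_neg
    return true_pos / total if total else 0.0


def relative_change(before: float, after: float) -> float:
    """Relative improvement of `after` over `before` (0.5 means +50%)."""
    if before == 0:
        raise ValueError("relative change is undefined for a zero baseline")
    return (after - before) / before


# Hypothetical counts for a baseline model and a hybrid-trained model.
baseline_p = precision(true_pos=4, false_pos=6)
hybrid_p = precision(true_pos=6, false_pos=4)
baseline_r = recall(true_pos=4, false_neg=4)
hybrid_r = recall(true_pos=6, false_neg=2)

print(f"precision: {baseline_p:.2f} -> {hybrid_p:.2f} "
      f"({relative_change(baseline_p, hybrid_p):+.0%})")
print(f"recall:    {baseline_r:.2f} -> {hybrid_r:.2f} "
      f"({relative_change(baseline_r, hybrid_r):+.0%})")
```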

Recommendations

  • More research is needed to determine under what conditions, if any, models successfully trained on artificial images might perform well on real images.
  • More work is also needed to better understand the ability of models trained on hybrid data sets to perform well on purely real images.

Table of Contents

  • Chapter One: Introduction
  • Chapter Two: Operationally Relevant Data
  • Chapter Three: Methodology
  • Chapter Four: Results
  • Chapter Five: Conclusions
  • Appendix A: The U.S. Military's Use of Bohemia Interactive Products
  • Appendix B: Additional Model and Hyperparameter Details

Research conducted by

Funding for this study was made possible by the independent research and development provisions of RAND's contracts for the operation of its U.S. Department of Defense federally funded research and development centers. The research was conducted within RAND Project AIR FORCE.

This report is part of the RAND Corporation research report series. RAND reports present research findings and objective analysis that address the challenges facing the public and private sectors. All RAND reports undergo rigorous peer review to ensure high standards for research quality and objectivity.

Permission is given to duplicate this electronic document for personal use only, as long as it is unaltered and complete. Copies may not be duplicated for commercial purposes. Unauthorized posting of RAND PDFs to a non-RAND Web site is prohibited. RAND PDFs are protected under copyright law. For information on reprint and linking permissions, please visit the RAND Permissions page.

The RAND Corporation is a nonprofit institution that helps improve policy and decisionmaking through research and analysis. RAND's publications do not necessarily reflect the opinions of its research clients and sponsors.