Harnessing Constructive Simulations for Reinforcement Learning

Gary J. Briggs

Research · Published Aug 8, 2024

Reinforcement learning (RL) is a powerful artificial intelligence (AI) technique for developing software agents that make intelligent decisions and exhibit complex behaviors. RL works by applying feedback from the environment, in the form of rewards and penalties, so that agents learn how to succeed in that environment. RAND researchers have developed a flexible software harness that enables the use of state-of-the-art RL methods in many existing constructive simulations without requiring significant additional coding. RL can be used to train software agents in constructive simulations to make the decisions the operator wants or to behave more realistically. The harness provides a simple interface that allows developers to keep using their current model's programming language, and its approach to model execution gives it excellent speed and memory performance.

This report is the sixth in a series addressing how AI could be employed to assist warfighters in four distinct areas: cybersecurity, predictive maintenance, wargames, and mission planning. This report is aimed primarily at those with an interest in mission planning, operations research, and AI applications more generally.

Key Findings

  • RAND researchers have developed a flexible software harness that enables the use of state-of-the-art RL methods in many existing constructive simulations without requiring significant additional coding.
  • RL is a powerful AI technique that can be used to train software agents in constructive simulations to make the decisions the operator wants or to behave more realistically.
  • Most modern RL gyms (for training software agents) are written in Python, whereas some of the most widely used constructive simulations, such as the Air Force Research Laboratory's Advanced Framework for Simulation, Integration, and Modeling (AFSIM), are written in other programming languages.
  • The RAND RL software harness isolates agent training from agent employment, allowing researchers to use agents trained in modern RL gyms within existing constructive simulations written in C++.
  • Researchers at RAND have demonstrated the harness in AFSIM for the case of an aircraft attempting to penetrate an adversary's integrated air defense system.
  • RAND's RL software harness has been made available to all authorized users on the Air Force Research Laboratory's AFSIM portal.
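
The separation of agent training from agent employment described in the findings above can be illustrated with a toy sketch. This is not the RAND harness and does not use AFSIM: the corridor environment, the tabular Q-learning trainer, the JSON policy file, and every function name below are assumptions chosen only to show the general idea. An agent learns from rewards and penalties in a Python training loop, and the frozen result is then consumed by code that needs nothing from the training side.

```python
import json
import random

# Hypothetical toy environment: a 1-D corridor with cells 0..4.
# The agent starts in cell 0 and is rewarded for reaching cell 4.
N_STATES = 5
ACTIONS = (-1, +1)  # index 0 moves left, index 1 moves right


def step(state, action_index):
    """Apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else -0.01  # reward at the goal, small penalty per step
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Training side: tabular Q-learning driven by rewards and penalties."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(2)  # explore
            else:
                a = 0 if q[state][0] > q[state][1] else 1  # exploit
            nxt, reward, done = step(state, a)
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
    return q


def export_policy(q, path):
    """Freeze the greedy policy as a plain JSON lookup table."""
    policy = [0 if row[0] > row[1] else 1 for row in q]
    with open(path, "w") as f:
        json.dump({"actions": list(ACTIONS), "policy": policy}, f)


def load_policy(path):
    """Employment side: needs only the exported file, not the training code."""
    with open(path) as f:
        data = json.load(f)
    actions, policy = data["actions"], data["policy"]
    return lambda state: actions[policy[state]]


def rollout(act, max_steps=100):
    """Stand-in for a simulation run: follow the policy, count steps to the goal."""
    state, steps = 0, 0
    while state != N_STATES - 1 and steps < max_steps:
        state = min(max(state + act(state), 0), N_STATES - 1)
        steps += 1
    return steps
```

In this sketch, `train` and `export_policy` would run once in the training environment, while `load_policy` and `rollout` stand in for the simulation side. Because the exported policy is just a lookup table in a language-neutral file, the consuming code could in principle be written in any language, which is the property the findings attribute to the harness.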

Citation

RAND Style Manual
Briggs, Gary J., Harnessing Constructive Simulations for Reinforcement Learning, RAND Corporation, RR-A1722-6, 2024. As of September 17, 2024: https://www.rand.org/pubs/research_reports/RRA1722-6.html
Chicago Manual of Style
Briggs, Gary J., Harnessing Constructive Simulations for Reinforcement Learning. Santa Monica, CA: RAND Corporation, 2024. https://www.rand.org/pubs/research_reports/RRA1722-6.html.

The research reported here was commissioned by Air Force Materiel Command, Strategic Plans, Programs, Requirements and Assessments (AFMC/A5/8/9) and conducted within the Force Modernization and Employment Program of RAND Project AIR FORCE.

This publication is part of the RAND research report series. Research reports present research findings and objective analysis that address the challenges facing the public and private sectors. All RAND research reports undergo rigorous peer review to ensure high standards for research quality and objectivity.

This document and trademark(s) contained herein are protected by law. This representation of RAND intellectual property is provided for noncommercial use only. Unauthorized posting of this publication online is prohibited; linking directly to this product page is encouraged. Permission is required from RAND to reproduce, or reuse in another form, any of its research documents for commercial purposes. For information on reprint and reuse permissions, please visit www.rand.org/pubs/permissions.

RAND is a nonprofit institution that helps improve policy and decisionmaking through research and analysis. RAND's publications do not necessarily reflect the opinions of its research clients and sponsors.