The U.S. military and intelligence community are interested in developing and deploying artificial intelligence (AI) systems to support intelligence analysis. The authors develop methods for assessing the impact AI systems are likely to have on intelligence missions that they support.
Evaluating the Effectiveness of Artificial Intelligence Systems in Intelligence Analysis
- How are AI system measures of performance connected with effectiveness in intelligence analysis?
- How might AI be used to support the intelligence process, both in real systems now under development and in hypothetical systems that may not yet be in development?
- How can researchers model the intelligence process for the purposes of determining how AI systems situated in this process affect it?
- What metrics exist to characterize the performance of AI systems?
The U.S. military and intelligence community have shown interest in developing and deploying artificial intelligence (AI) systems to support intelligence analysis, both as an opportunity to leverage new technology and as a solution for an ever-proliferating data glut. However, deploying AI systems in a national security context requires the ability to measure how well those systems will perform in the context of their mission.
To address this issue, the authors begin by introducing a taxonomy of the roles that AI systems can play in supporting intelligence—namely, automated analysis, collection support, evaluation support, and information prioritization—and provide qualitative analyses of the drivers of the impact of system performance for each of these categories.
The authors then single out for quantitative analysis information prioritization systems, which direct intelligence analysts' attention to useful information and allow them to pass over information that is not useful to them. Developing a simple mathematical model that captures the consequences of errors on the part of such systems, the authors show that the efficacy of these systems depends not just on the properties of the system but also on how the system is used. Through this exercise, the authors show how both the calculated impact of an AI system and the metrics used to predict it can be used to characterize the system's performance in a way that can help decisionmakers understand its actual value to the intelligence mission.
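The claim that efficacy depends on how a prioritization system is used can be illustrated with a toy calculation. The sketch below is not the authors' model; it is a minimal stand-in under stated assumptions: a fixed pool of items, a known fraction of useful items, a system described only by its recall and false positive rate, and analysts who review flagged items first (in random order) and then sample from the unflagged remainder if they have capacity left. The function name, parameters, and the two example systems are all hypothetical.

```python
def expected_useful_found(n_items, base_rate, recall, false_positive_rate, capacity):
    """Expected number of useful items analysts review (toy model, assumptions above)."""
    useful = n_items * base_rate
    useless = n_items - useful

    # Items the system flags, split by whether they are actually useful.
    flagged_useful = useful * recall
    flagged_useless = useless * false_positive_rate
    flagged = flagged_useful + flagged_useless
    unflagged = n_items - flagged

    # Analysts cannot review more items than exist.
    capacity = min(capacity, n_items)

    # Case 1: capacity runs out inside the flagged set, so yield is set by precision.
    if flagged > 0 and capacity <= flagged:
        return capacity * flagged_useful / flagged

    # Case 2: all flagged items are reviewed, then the unflagged pool is sampled.
    remaining = capacity - flagged
    missed_useful = useful - flagged_useful
    density = missed_useful / unflagged if unflagged > 0 else 0.0
    return flagged_useful + remaining * density


# Two hypothetical systems: one precise but narrow, one broad but noisy.
precise = {"recall": 0.4, "false_positive_rate": 0.01}
broad = {"recall": 0.9, "false_positive_rate": 0.2}

for capacity in (20, 300):
    a = expected_useful_found(1000, 0.05, capacity=capacity, **precise)
    b = expected_useful_found(1000, 0.05, capacity=capacity, **broad)
    print(f"capacity={capacity}: precise={a:.1f}, broad={b:.1f}")
```

Under these assumptions, the precise system yields more useful items when analysts can review only a few items, while the broad system yields more when review capacity is larger. Neither system is better in the abstract; the ranking changes with the resources devoted to the mission.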
Using metrics not matched to actual priorities obscures system performance and impedes informed choice of the optimal system
- Metric choice should take place before the system is built and be guided by attempts to estimate the real impact of system deployment.
Effectiveness, and therefore the metrics that measure it, can depend not just on system properties but also on how the system is used
- A key consideration for decisionmakers is the amount of resources devoted to the mission beyond those devoted to building the system.
- Begin with the right metrics. This requires a detailed understanding of how an AI system will be used and choosing metrics that reflect success in that use.
- Reevaluate (and retune) regularly. Because the world around the system continues to evolve after deployment, system evaluation must continue as a portion of regular maintenance.
- Speak the language. System designers have a well-established set of metrics for capturing the performance of AI systems, and being conversant in these traditional metrics will ease communication with experts during the process of designing a new system or maintaining an existing one.
- Conduct further research into methods of evaluating AI system effectiveness.
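As a concrete reference point for the traditional metrics mentioned in the recommendations above, the sketch below computes four common ones from confusion-matrix counts. The summary does not enumerate which metrics it has in mind, so the choice of precision, recall, F1 score, and false positive rate is an assumption, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of system decisions against ground truth for a binary task."""
    true_positives: int
    false_positives: int
    false_negatives: int
    true_negatives: int


def _ratio(numerator, denominator):
    # Return 0.0 rather than raising when a metric is undefined.
    return numerator / denominator if denominator else 0.0


def precision(c):
    """Fraction of flagged items that were actually useful."""
    return _ratio(c.true_positives, c.true_positives + c.false_positives)


def recall(c):
    """Fraction of useful items that the system flagged."""
    return _ratio(c.true_positives, c.true_positives + c.false_negatives)


def false_positive_rate(c):
    """Fraction of non-useful items that the system flagged anyway."""
    return _ratio(c.false_positives, c.false_positives + c.true_negatives)


def f1_score(c):
    """Harmonic mean of precision and recall."""
    p, r = precision(c), recall(c)
    return _ratio(2 * p * r, p + r)


# Hypothetical counts for illustration only.
counts = ConfusionCounts(true_positives=45, false_positives=190,
                         false_negatives=5, true_negatives=760)
print(precision(counts), recall(counts), false_positive_rate(counts), f1_score(counts))
```

These metrics describe the system in isolation. Whether a given value is good enough still depends on how the system will be used, which is why the first recommendation asks for metrics matched to that use.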
Table of Contents
Tracing Effectiveness from Mission to System
Measuring Performance and Effectiveness
Derivations and Technical Details