Research Questions

  1. What threat models should AI organizations defend against when protecting frontier AI models?
  2. How can the weights of frontier AI models be protected against a variety of attackers?
  3. How can frontier AI organizations assess which attackers they are protected against given their security posture?

As frontier artificial intelligence (AI) models — that is, models that match or exceed the capabilities of the most advanced models at the time of their development — become more capable, protecting them from theft and misuse will become more important. The authors of this report explore what it would take to protect model weights — the learnable parameters that encode the core intelligence of an AI — from theft by a variety of potential attackers.

Specifically, the authors (1) identify 38 meaningfully distinct attack vectors, (2) explore a variety of potential attacker operational capacities, from opportunistic (often financially driven) criminals to highly resourced nation-state operations, (3) estimate the feasibility of each attack vector being executed by different categories of attackers, and (4) define five security levels and recommend preliminary benchmark security systems that roughly achieve the security levels.

This report can help security teams in frontier AI organizations update their threat models and inform their security plans, and can help policymakers better understand how to engage with AI organizations on security-related topics.

Key Findings

  • AI organizations face a diverse set of threats, across many meaningfully distinct attack vectors and a wide range of attacker capacities.
  • There is rough agreement among cybersecurity and national security experts on how to protect digital systems and information from less capable actors, but there is a wide diversity of views on what is needed to defend against more-capable actors, such as top cyber-capable nation-states.
  • The security of frontier AI model weights cannot be ensured by implementing a small number of "silver bullet" security measures. A comprehensive approach is needed, including significant investment in infrastructure and many different security measures addressing different potential risks.
  • There are many opportunities for significantly improving the security of model weights at frontier AI organizations in the short term.
  • Securing model weights against the most capable actors will require significantly more investment over the coming years.

Recommendations

  • Developers of AI models should have a clear plan for securing models that are considered to have dangerous capabilities.
  • Organizations developing frontier models should use the threat landscape analysis and security level benchmarks detailed in the report to assess which security vulnerabilities they are already addressing and to prioritize those they have yet to address.
  • Develop a security plan against a comprehensive threat model, focused on preventing unauthorized access to and theft of the model's weights.
  • Centralize all copies of weights to a limited number of access-controlled and monitored systems.
  • Reduce the number of people authorized to access the weights.
  • Harden interfaces for model access against weight exfiltration.
  • Implement insider threat programs.
  • Invest in defense-in-depth (multiple layers of security controls that provide redundancy in case some controls fail).
  • Engage advanced third-party red-teaming that reasonably simulates relevant threat actors.
  • Incorporate confidential computing to secure the weights during use and reduce the attack surface.

Funding for this research was provided by gifts from RAND supporters. The research was conducted by the Meselson Center within RAND Global and Emerging Risks.

This report is part of the RAND research report series. RAND reports present research findings and objective analysis that address the challenges facing the public and private sectors. All RAND reports undergo rigorous peer review to ensure high standards for research quality and objectivity.

This document and trademark(s) contained herein are protected by law. This representation of RAND intellectual property is provided for noncommercial use only. Unauthorized posting of this publication online is prohibited; linking directly to this product page is encouraged. Permission is required from RAND to reproduce, or reuse in another form, any of its research documents for commercial purposes. For information on reprint and reuse permissions, please visit www.rand.org/pubs/permissions.

RAND is a nonprofit institution that helps improve policy and decisionmaking through research and analysis. RAND's publications do not necessarily reflect the opinions of its research clients and sponsors.