Research Questions

  1. How well does red teaming work in practice as a method to identify new and emerging risks and capabilities of foundation models, and what are its limitations?
  2. Across a spectrum of policy options, how can we ensure that red teaming effectively identifies new risks, and what are the potential trade-offs and wider or unintended consequences?
  3. What are the key considerations for policy development in assessing risks associated with foundation models, and what further questions need to be researched?

On 12 September 2023, RAND Europe and the Centre for Long-Term Resilience organised a virtual workshop to inform UK government thinking on policy levers for identifying risks from artificial intelligence (AI) foundation models in the lead-up to the AI Safety Summit in November 2023. The workshop focused on the use of red teaming for risk identification and on the opportunities, challenges and trade-offs that may arise in using this method.

The workshop brought together participants from academia and public sector research organisations, non-governmental organisations and charities, the private sector, the legal profession and government. It consisted of interactive discussions among the participants, both in plenary and in smaller breakout groups. This short report summarises the views and ideas discussed at the workshop to stimulate further debate and thinking as policy around this topical issue develops in the coming months.

Key Findings

The discussion focused on the following themes associated with the use of red teaming with AI foundation models to identify risks:

  • The term 'red teaming' is used loosely across the global AI community. A crucial first step is to develop a clear and shared taxonomy, along with shared norms and good practice around red teaming, for example regarding who to involve, how to implement it and how to share findings.
  • Red teaming is one specific tool that is part of the wider risk identification, assessment and management toolbox. It is not a governance mechanism in itself.
  • Red teaming is useful in certain cases, in particular for medium-term risks and for assessing known risks. Its key limitations include difficulty in identifying unknown or chronic risks.
  • The socio-technical aspect of red teaming – who does it and in what context – must be actively considered. Embedding a diversity of perspectives, with deep understanding of the risks, the domain, and the actors or adversaries, is essential to improve a red team's effectiveness.
  • Specific methods such as red teaming should not be the focal point of mandated risk-management activities. If mandates are put in place, they should instead focus on holistic approaches and risk-management frameworks.

Research conducted by

This work was funded by the RAND Corporation and conducted by the Centre for Long-Term Resilience and RAND Europe.

This report is part of the RAND conference proceeding series. RAND conference proceedings present a collection of papers delivered at a conference or a summary of the conference.

