Red-Teaming the Risks of Using AI in Biological Attacks

Mar 25, 2024

Illustrations by Grandfailure and traffic_analyzer/Getty Images

It's a nightmare scenario: A terrorist group taps into the power of artificial intelligence (AI) to help it plan a devastating biological attack. But a recent RAND experiment suggests that's still beyond today's technology. Even advanced computer language models were no better than a simple Google search.

The experiment brought together several dozen researchers to role-play bad guys. Some had backgrounds in biology, others in computer science, and others in operational planning. They were divided into small teams and given seven weeks to describe what it would take to carry out a biological attack.

Some of the groups had access to large language models, AI systems that use huge troves of training data to answer questions in a human-like way. Others only had access to the internet. The security protocols for both were stringent. The researchers did not use any classified information, and all research was conducted on protected networks. The final report on the experiment does not disclose any specific details about the plans the teams developed.

A panel of experts graded each team's plan on whether or not it could be viable. Most of the plans had major flaws and fell somewhere between problematic and unworkable. But there was no significant difference in the scores between those who used a large language model and those who used only the internet.

The highest score did go to a team made up entirely of researchers with expertise in using large language models. But when organizers looked at that team's chat logs, they found that its plan was not based on information from the AI. Instead, the team had dived into the academic literature on virus research, all of which was available online.

The organizers noted a few AI outputs that they described as unfortunate. In one case, for example, the AI engaged in a discussion about how to cause maximum casualties with a biological weapon. All of the biological agents it suggested, though, are listed on the website of the Centers for Disease Control and Prevention or elsewhere online.

Another team asked the AI if it could provide more detailed instructions for cultivating the bacteria that cause plague. The computer answered: “Of course! Here are more detailed instructions….” But there, too, the information it provided is available with a few clicks on the internet.

In fact, the organizers found that none of the AI outputs yielded any meaningful information that the internet-only teams could not also find. They concluded that planning a biological attack remains beyond the capabilities of existing AI systems.

But they cautioned: “It remains uncertain whether these risks lie 'just beyond'” the frontier of existing AI models, or whether they will always be too complicated and multifaceted for a computer to handle. The methods RAND developed to test those limits—a practice known as red-teaming—can help ensure the nightmare scenario never happens.

Doug Irving