AI Companies Say Safety Is a Priority. It's Not

commentary

Jul 9, 2024

Photo by wildpixel/Getty Images

This commentary originally appeared in the San Francisco Chronicle on July 9, 2024.

It could save us or it could kill us.

That's what many of the top technologists in the world believe about the future of artificial intelligence. This is why companies like OpenAI emphasize their dedication to seemingly conflicting goals: accelerating technological progress as rapidly—but also as safely—as possible.

It's a laudable intention, but none of these companies seems to be succeeding.

Take OpenAI, for example. The leading AI company in the world believes the best approach to building beneficial technology is to ensure that its employees are “perfectly aligned” with the organization's mission. That sounds reasonable, but what does it mean in practice?

A lot of groupthink—and that is dangerous.

As social animals, it's natural for us to form groups or tribes to pursue shared goals. But these groups can grow insular and secretive, distrustful of outsiders and their ideas. Decades of psychological research have shown how groups can stifle dissent by punishing or even casting out dissenters. Before the 1986 Challenger space shuttle explosion, engineers raised safety concerns about how the rocket boosters would perform in freezing weather. Yet they were overruled by their leadership, who may have felt pressure to avoid delaying the launch.

According to a group of AI insiders, something similar is taking place at OpenAI. In an open letter, nine current and former employees allege that the company uses hardball tactics to stifle dissent from workers about its technology. One of the researchers who signed the letter described the company as “recklessly racing” for dominance in the field.

It's not just happening at OpenAI. Earlier this year, an engineer at Microsoft grew concerned that the company's AI tools were generating violent and sexual imagery. He first tried to get the company to pull them off the market, but when that didn't work, he went public in a LinkedIn post. Then, he said, Microsoft's legal team demanded he delete it. In 2021, former Facebook product manager Frances Haugen revealed internal research showing the company knew that the algorithms—often referred to as the building blocks of AI—that Instagram used to surface content for young users were exposing teen girls to images harmful to their mental health. When asked in a “60 Minutes” interview why she spoke out, Haugen responded, “Person after person after person has tackled this inside of Facebook and ground themselves to the ground.”

Leaders at AI companies claim they have a laser focus on ensuring that their products are safe. They have, for example, commissioned research, set up “trust and safety” teams, and even started new companies to help achieve these aims. But these claims are undercut when insiders paint a familiar picture of a culture of negligence and secrecy that—far from prioritizing safety—instead dismisses warnings and hides evidence about unsafe practices, whether to preserve profits, to avoid slowing progress, or simply to spare the feelings of leaders.

So what can these companies do differently?

As a first step, AI companies could ban nondisparagement or confidentiality clauses. The OpenAI whistleblowers asked for that in their open letter and the company says it has already taken such steps. But removing explicit threats of punishment isn't enough if an insular workplace culture continues to implicitly discourage concerns that might slow progress.

Rather than simply allowing dissent, tech companies could encourage it, putting more options on the table. This could involve, say, beefing up the “bug bounty” programs that tech companies already use to reward employees and customers who identify flaws in their software. Companies could embed a “devil's advocate” role inside software or policy teams that would be charged with opposing consensus positions.

AI companies might also learn from how other highly skilled, mission-focused teams avoid groupthink. Military special operations forces prize group cohesion but recognize that cultivating dissent—from anyone, regardless of rank or role—might prove the difference between life and death. For example, Army doctrine—the fundamental principles that guide military organizations—emphasizes that special operations forces must know how to employ small teams and individuals as autonomous actors.

Finally, organizations already working to make AI models more transparent could shed light on their inner workings. Secrecy has been ingrained in how many AI companies operate; rebuilding public trust could require pulling back that curtain by, for example, more clearly explaining safety processes or publicly responding to criticism.

To be sure, group decisionmaking can benefit from pooling information or overcoming individual biases, but too often it results in overconfidence or conformity to group norms. With AI, the stakes of silencing those who don't toe the company line, instead of viewing them as vital sources of mission-critical information, are too high to ignore.

It's human nature to form tribes—to want to work with and seek support from a tight group of like-minded people. It's also admirable, if grandiose, to adopt as one's mission nothing less than building tools to tackle humanity's greatest challenges. But AI technologies will likely fall short of that lofty goal—rapid yet responsible technological advancement—if their developers fall prey to a fundamental human flaw: refusing to heed hard truths from those who would know.


Douglas Yeung is a senior behavioral scientist at RAND and a member of the Pardee RAND Graduate School faculty.