The Case for and Against AI Watermarking

commentary

Jan 17, 2024

Photo: AI on an advanced central processing unit with a judge's gavel (Dragon Claws/Getty Images)

This commentary originally appeared on Federal Times on January 16, 2024.

In October, a White House executive order on artificial intelligence invoked the use of watermarking to meet safety, security, and trustworthiness goals for AI outputs like images, videos, or text. This was followed by language in a defense authorization bill calling for a competition to evaluate these technologies.

But watermarking things created by AI is far more complicated than these proclamations might suggest.

The hype around AI-output watermarking may be masking the very real systems challenges that have so far thwarted widespread adoption of such technologies for uses other than AI, such as tracking digital content or managing copyrights.

To start, it is important to understand the constraints of watermarking systems. Any successful watermarking system must balance three issues: robustness (what kinds of processing the watermark should be able to survive), fidelity (how much change to the original object is allowed), and capacity (how many bits, or 1's and 0's, the mark must carry). These form a classic “trade-triangle” where—all else being equal—increasing one property generally occurs at the expense of the others.
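To make these trade-offs concrete, consider a deliberately naive least-significant-bit (LSB) watermark, sketched below in Python. This is a hypothetical illustration, not a scheme in actual use: the function names and payload are invented for the example, and it assumes a grayscale image held as a NumPy array.

    # Naive LSB watermark: illustrates the capacity/fidelity/robustness trade-offs.
    import numpy as np

    def embed_lsb(image: np.ndarray, bits: list) -> np.ndarray:
        """Write each payload bit into the least-significant bit of one pixel."""
        marked = image.copy().ravel()
        for i, b in enumerate(bits):            # capacity: one pixel consumed per payload bit
            marked[i] = (marked[i] & 0xFE) | b  # fidelity: each marked pixel changes by at most 1
        return marked.reshape(image.shape)

    def extract_lsb(image: np.ndarray, n_bits: int) -> list:
        """Read back the first n_bits LSBs; assumes pixels are unaltered and aligned."""
        return [int(p & 1) for p in image.ravel()[:n_bits]]

    img = np.random.randint(0, 256, (64, 64), dtype=np.uint8)  # stand-in for a generated image
    payload = [1, 0, 1, 1, 0, 0, 1, 0]                         # 8 bits of provenance data
    marked = embed_lsb(img, payload)
    assert extract_lsb(marked, len(payload)) == payload
    # Robustness is near zero: any resize, re-quantization, or lossy re-save
    # (such as JPEG) scrambles these low-order bits and destroys the mark.

Carrying a larger payload means touching more pixels, and making the mark survive more processing means making larger, more visible changes: the triangle cannot be escaped, only balanced.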

As framed by the White House, the purpose of the watermark is to help people identify AI-generated objects, which would mean applying the marks to those objects in a way that is both reliable and trustworthy. This requires aligning the trade-offs in a way that faithfully embodies that goal while not allowing misuse of the system, such as applying watermarks to non-AI-generated objects or removing watermarks from AI-generated objects.

The good news is that because AI objects are generated, many of the traditional concerns about altering the "original" work that have limited past watermarking efforts are alleviated. However, an AI-watermarking solution must still provide enough capacity for the mark to be bound permanently to the object, proving when it was marked and by whom, while remaining robust enough to stand up to processing by any kind of software. Tension arises because the number of changes available for watermarking is limited: bits that carry information for one purpose are not available for another.

To be undermined, a watermark need not be erased but only rendered undetectable, an effect called desynchronization, which can occur when watermark bits are altered. Even benign users are likely to use or share AI-derived content in ways that risk the integrity and synchronization of watermarks: resaving, cropping, or editing an image is likely to interfere with the ability of the mark to be read. Malicious actors may take actions that are visually minor but destructive, such as small rotations or transcoding (saving content to different formats, such as PNG to JPEG to GIF). They might also attempt more complex actions, such as combining objects to produce outputs with multiple marks or placing illegitimate marks on legitimate objects.
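A toy example of desynchronization, under the same caveats as the sketch above: even if every watermark bit survives an edit, a detector that expects the bits at fixed positions can no longer find them after a small shift such as a crop.

    # The watermark bits still exist, but a one-sample shift (e.g., a crop that
    # drops a leading row, or a re-encode that trims a sample) means the detector
    # no longer lines up with them.
    payload = [1, 0, 1, 1, 0, 0, 1, 0]
    carrier = payload + [0] * 8               # marked signal: payload followed by other content
    shifted = carrier[1:]                     # visually negligible edit
    print(shifted[:len(payload)] == payload)  # False: the mark is unreadable, though not erased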

To be successful, the overall watermarking system must be designed, from the algorithm to the software, to address questions such as: What is the purpose of the mark, and what information should it carry? Who should be able to read the mark, and when? How should the lack of a mark, or the presence of multiple marks, be interpreted?

Malicious Intent

The absence of a mark can only inform a user that an object is not unaltered AI output; it says nothing about malicious intent. The presence of a mark is less a definitive indicator of AI-generated content and more a flag of suspicion. Even in closed systems, these issues have stymied the deployment of watermarking systems.

If such questions are not answered, media consumers may be overwhelmed with false negatives, such as content that is clearly (or not so clearly) AI-generated but lacks a watermark, and false positives, such as false watermarks added to objects that were never AI-generated. The watermarking algorithm itself can become a vehicle for novel and powerful attacks against the information ecosystem.

Beyond the goals of the White House and the competition called for in the defense bill, the government may wish to consider broadening its investigation into watermarking technologies to address these issues and to establish the bounds of and expectations placed on this technology. This may involve establishing metrics and frameworks that allow technologies to be evaluated relative to the broader policy goals.

Importantly, this must include examination of the cost in terms of development, computing power, and infrastructure, as well as the cost of establishing the user trust and goodwill required for a watermarking system to function effectively. These costs must be weighed against the benefits that watermarking AI outputs may offer, and questions must be asked about the viability of usable systems and the cost-benefit trade-off they provide. Failure to address watermarking as a systems problem is likely to result in solutions that risk working against the security and trust they seek to instill.


Chad Heitzenrater is a senior information scientist at the nonprofit, nonpartisan RAND Corporation.

More About This Commentary

Commentary gives RAND researchers a platform to convey insights based on their professional expertise and often on their peer-reviewed research and analysis.