AI Spits Out Exact Copies of Training Images, Real People, Logos, Researchers Find

The regurgitation of training data exposes image diffusion models to a number of privacy and copyright risks.
Image: Carlini, Hayes, et al. 

Researchers have found that image-generation AI tools such as the popular Stable Diffusion model memorize training images—typically made by real artists and scraped for free from the web—and can spit them out as nearly-identical copies. 

According to a preprint paper posted to arXiv on Monday, researchers extracted over a thousand training examples from the models—everything from photographs of individual people, to film stills and copyrighted press photos, to trademarked company logos—and found that the AI regurgitated many of them nearly exactly. 


So-called image diffusion models—a category that includes Stable Diffusion, OpenAI's DALL-E 2, and Google's Imagen—are trained by adding noise to training images and learning to remove it; at generation time, that denoising process is meant to produce original images from a human user's text prompt. Such models have been the focus of outrage because they are trained on work from real artists (typically without compensation or consent), and that provenance sometimes surfaces in the form of recognizable art styles or mangled artist signatures. 
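The add-noise-then-denoise idea can be illustrated with a toy sketch (this is not the actual Stable Diffusion code; the single-step setup, the `alpha` value, and the tiny array standing in for an image are all illustrative assumptions). A real model learns a network that predicts the noise; here the true noise is reused so the reverse step recovers the image exactly:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for one training image (values in [0, 1]).
image = rng.random((8, 8))

# Forward step: blend the image with Gaussian noise.
alpha = 0.7  # fraction of signal energy kept at this step (illustrative)
noise = rng.normal(0.0, 1.0, size=image.shape)
noisy = np.sqrt(alpha) * image + np.sqrt(1 - alpha) * noise

# Reverse step: a trained model would *predict* the noise from `noisy`;
# here we substitute the true noise to show the algebra of denoising.
predicted_noise = noise
recovered = (noisy - np.sqrt(1 - alpha) * predicted_noise) / np.sqrt(alpha)

assert np.allclose(recovered, image)
```

The memorization problem described in the article arises when the learned noise-predictor effectively stores a specific training image, so that denoising from the right prompt reconstructs that image rather than a novel one.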

However, the researchers demonstrate that the model will sometimes generate an image nearly identical to one it was trained on, with only inconsequential changes such as added noise. 

“The issue of memorization is that in the process of training your model, it might sort of overfit on individual images, where now it remembers what that image looks like, and then at generation time, it inadvertently can regenerate that image,” one of the paper’s co-authors Eric Wallace, a Ph.D. student at the University of California, Berkeley, told Motherboard. “So it's kind of an undesirable quantity where you want to minimize it as much as possible and promote these kinds of novel generations."

One example the researchers provide is an image of American evangelist Ann Graham Lotz, taken from her Wikipedia page. When Stable Diffusion was prompted with “Ann Graham Lotz,” the AI spit out the same image, with the only difference being that the AI-generated image was a bit noisier. The researchers measured the distance between the two images and found their pixel compositions nearly identical, which qualified the image as memorized by the AI. 
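A simplified version of such a pixel-distance check might look like the following (the paper's actual criterion is more involved—it uses a tiled ℓ2 distance to resist trivial evasions—so the threshold, function name, and test images here are illustrative assumptions):

```python
import numpy as np

def is_memorized(generated, training, threshold=0.1):
    """Flag `generated` as a memorized copy of `training` if the
    normalized root-mean-square pixel distance is below `threshold`.
    Simplified stand-in for the paper's tiled l2 criterion."""
    diff = generated.astype(float) - training.astype(float)
    dist = np.sqrt(np.mean(diff ** 2)) / 255.0  # normalize to [0, 1]
    return dist < threshold

rng = np.random.default_rng(1)
original = rng.integers(0, 256, size=(64, 64, 3))

# A slightly noisier copy (like the Ann Graham Lotz example) is flagged...
noisy_copy = np.clip(original + rng.normal(0, 5, original.shape), 0, 255)
assert is_memorized(noisy_copy, original)

# ...while an unrelated image is not.
unrelated = rng.integers(0, 256, size=(64, 64, 3))
assert not is_memorized(unrelated, original)
```

The key point is that memorization is judged in pixel space: a generation can match a prompt semantically without being anywhere near any single training image.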


The researchers demonstrated that a non-memorized response can still accurately depict the text that the model was prompted with, but would not have a similar pixel makeup and would deviate from any training images. When they prompted Stable Diffusion with “Obama,” an image that looked like Obama was produced, but not one that matched any image in the training dataset. The researchers showed that the four nearest training images were very different from the AI-generated image.  

The ability of diffusion models to memorize images creates a major copyright issue when models reproduce and distribute copyrighted material. The ability to regenerate pictures of certain individuals in a way that still maintains their likenesses, such as in Obama’s case, also poses a privacy risk to people who may not want their images being used to train AI. The researchers also found that many of the images in the training dataset were copyrighted and had been used without permission.

“Despite the fact that these images are publicly accessible on the Internet, not all of them are permissively licensed,” the researchers wrote. “We find that a significant number of these images fall under an explicit non-permissive copyright notice (35%). Many other images (61%) have no explicit copyright notice but may fall under a general copyright protection for the website that hosts them (e.g., images of products on a sales website).” 


In total, the researchers got the models to nearly identically reproduce over a hundred training images. Wallace said that the numbers reported are an "undercount of how much memorization might actually be happening" because they were only counting instances when the AI "exactly" reproduced an image, rather than something merely very close to the original. 

“This is kind of an industry-wide problem, not necessarily a Stability AI problem,” Wallace said. “I think there is a lot of past work already talking about this indirect copying or style copying of images, and our work is one very extreme example, where there are some cases of near-identical memorization in the training set. So I think there's potential that [our results] would change things from a legal or moral perspective when you're developing new systems.” 

In the study, the researchers conclude that diffusion AI models are the least private type of image-generation model. For example, they leak more than twice as much training data as Generative Adversarial Networks (GANs), an older type of image model. The researchers hope to warn developers about the privacy risks of diffusion models, which include the potential to duplicate and misuse copyrighted and sensitive private data (including medical images) and vulnerability to outside attacks in which training data can be easily extracted. One solution the researchers propose is to flag cases where generated images duplicate training images and to remove those images from the training dataset. 


Motherboard previously looked through the dataset that AI image generators like Stable Diffusion and Imagen were trained on, called LAION-5B. Unlike the researchers, who decided to manually extract the training data, we used a site called Have I Been Trained, which allows you to search through images in the dataset. We found that the training dataset contains artists’ copyrighted work and NSFW images such as leaked celebrity nudes and ISIS beheadings. 

Although OpenAI has since taken steps to prevent NSFW content from appearing and deduplicated DALL-E 2's training dataset in June to prevent regurgitation of the same photo, the concern is that each iteration released to the public exposes information and training data that then remains permanently public. 

“The issue here is that all of this is happening in production. The speed at which these things are being developed and a whole bunch of companies are kind of racing against each other to be the first to get the new model out just means that a lot of these issues are fixed after the fact with a new version of the model coming out,” paper co-author and assistant professor of computer science at ETH Zürich, Florian Tramèr, told Motherboard. 

“And, of course, the older versions are then still there, and so sometimes the cat is a little bit out of the bag once you've made one of these mistakes," he added. "I'm kind of hopeful that as things go forward, we sort of reach a point in this community where we can iron out some of these issues before putting things out there in the hands of millions of users.” 

OpenAI, Stability AI, and Google did not immediately respond to requests for comment.