On Monday, after an invite-only testing period, artificial intelligence company Stability AI released its text-to-image generation model, called Stable Diffusion, into the world as open-access. Like DALL-E Mini or Midjourney, Stable Diffusion is capable of creating vivid, even photorealistic images from simple text prompts using neural networks. Unlike DALL-E Mini or Midjourney, whose creators have implemented limits on what kind of images they can produce, Stable Diffusion’s model can be downloaded and tweaked by users to generate whatever they want. Inevitably, many of them are generating porn.
Stability AI has had more than 15,000 beta testers helping develop the model, and offered a researchers-only release earlier this month. But the model leaked well before Monday's official launch, which means that even before the open-access release, and despite the company's ban on generating explicit or sexual images, people have been hard at work churning out the horniest stuff possible, the only limit being what they can put into words.
This is all being done in defiance of Stability AI's weekslong warnings not to use its model for anything sexual. As AI-generated hentai nudes started appearing on forums and social media earlier this month, the company wrote in a Twitter announcement (and to the discredit of cool moms everywhere): "Don't generate anything you'd be ashamed to show your mother."
The images include everything from hentai, to computer-generated celebrity nudes, to naked images of people who don't really exist. Some of the results look almost convincing, while others are horrific: impossible bodies with errant limbs and distorted faces.
"If you want 18+ make it on your own GPUs when the model is released," Stability AI tweeted. The Stable Diffusion beta and DreamStudio, a browser-based application anyone can use to generate images with Stable Diffusion, both forbid "NSFW, lewd, or sexual material," and have content filters in place to prevent it.
Now that the model is public, and anyone can copy it to run on their own machines, people are doing exactly that, in droves.
Open access to the model is what sets Stable Diffusion apart from many of the other publicly available (but not open-access) AI text-to-image apps out there, including DALL-E and Midjourney, which have content filters that forbid nudity or sexual imagery. Being able to run the model on one's own machine, at home or in a research lab, rather than on a company's servers where the filters are pre-set and not adjustable, opens up a world of porn.
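In practice, running the model locally takes only a few lines of code. Below is a minimal sketch using Hugging Face's diffusers library and the publicly released Stable Diffusion checkpoint; the exact checkpoint name and API details are assumptions that may vary by library version, and it is offered as an illustration, not as any specific community's setup.

```python
# Minimal sketch: generating an image with Stable Diffusion on your own GPU.
# Assumes the Hugging Face "diffusers" library and the public
# CompVis/stable-diffusion-v1-4 checkpoint; details vary by version.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,  # half precision, to fit consumer GPUs
).to("cuda")

# Because the whole pipeline runs locally, the bundled NSFW safety checker
# is just another component of the object; nothing technically stops a
# user from replacing or removing it on their own machine.
image = pipe("a castle on a hill at sunset, oil painting").images[0]
image.save("output.png")
```

That last point is the crux: a filter that is code on your own computer, rather than a policy on someone else's server, is a suggestion.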
Four subreddits dedicated to its NSFW uses—r/unstablediffusion, r/PornDiffusion, r/HentaiDiffusion, and r/stablediffusionnsfw—had nearly 2,000 members combined, while the main Stable Diffusion subreddit had 8,000. (As of Thursday, after this piece was published, Reddit banned each of these subreddits except for r/HentaiDiffusion, citing its policies against non-consensual intimate media. Reddit declined to comment further on the bans.)
People interested in the project have been discussing and debating the sexually explicit outputs the model is capable of for weeks, with some people who have apparent access to leaked or beta versions of the model posting generated nudes as examples. On the image board 4chan, for example, people have been leaking the model and posting their creations for days leading up to the release.
Out of these subreddits grew a Discord server, started by someone who moderates r/Singularity as well as r/UnstableDiffusion, r/PornDiffusion, and r/HentaiDiffusion, and who goes by Ashley22 on the Unstable Diffusion Discord.
“The truth is, I created this server because while some of the images are simply pornographic, and that’s fine, I found many of the images to be incredibly artistic and beautiful in a way you just don’t get in that many places,” they told Motherboard. “I don’t like the pervasive, overwhelming prudishness and fear of erotic art that’s out there, I started this server because I believe that these models can change our perception of things like nudity, taboo, and what’s fair by giving everyone the ability to communicate with images.”
“There is a novelty in experiencing NSFW content generated by AI,” Redditor johnnyornot, who started and moderates r/stablediffusionnsfw, told Motherboard. “It’s almost a little bit taboo to view it knowing it’s been generated by a robot. And there’s also the raw curiosity about whether it will capture the complexity of what causes human arousal, or will it be a distorted, alien mimicry of it.”
Despite all of this interest, Emad Mostaque, the founder of Stability AI, told me that he thinks erotic uses of the model are overhyped. “In general I think NSFW gets too much attention, the parameters of this vary widely and the vast majority of the usage we have seen of this has been creative and great,” he said.
But many of the people in NSFW-focused generator communities are, arguably, making "creative and great" works, and pushing the model to its technical limits. There's a ton of hentai, which might not be everyone's yum but isn't harmful. Many people are creating non-anime fantasy people and scenarios: a nude sword-wielding barbarian woman, a cosplayer dressed as Pokémon trainer Misty, lots of elven ladies. Most of the images are kind of rough, with an eye sliding down a cheek or an extra arm, but the mind is very good at filling in blanks or ignoring the odd anomaly.
To get these results, people are crafting and sharing long, descriptive prompts for others to try entering into the model, like “Oil painting nude white naked princess realistic exposed symmetrical breasts and realistic exposed thighs of (subject) with gorgeous detailed eyes, the sky, color page, tankoban, 4 k, tone mapping, doll, akihiko yoshida, james jean, andrei riabovitchev, marc simonetti, yoshitaka amano, long hair, curly.”
“Seems like the model knows how to do naked people better than clothed from what i have seen,” one person in the Unstable Diffusion Discord wrote.
Stability AI told people to go wild with their fantasies once the model was made public, but as part of licensing the model, users have to agree to its "Misuse, Malicious Use, and Out-of-Scope Use" terms. For the open-access version, Stability AI uses a CreativeML OpenRAIL-M license that borrows from DALL-E Mini's terms of use, and forbids "generating images that people would foreseeably find disturbing, distressing, or offensive; or content that propagates historical or current stereotypes."
All of this is classic platform-liability speak. It covers such a vague, broad range of subjective experiences that, given a sensitive enough viewer, it could feasibly disallow every image one could ever conceive of, or nothing at all. When similar terms appear on other platforms, they mostly get interpreted as banning anything sexually explicit, allowing moderators to remove whatever doesn't suit investors or advertisers at the moment.
The open-access model of Stable Diffusion isn’t a platform, however, and the company can’t really enforce these rules or dictate what people do with it once it’s being used on someone else’s computer. The model is “customisable to the end users legal, ethical and moral position,” according to Mostaque. “Users of the model must abide by the terms of the CreativeML OpenRAIL-M license. We endorse and approve usage within these parameters, beyond that it is end user responsibility.”
Stable Diffusion uses a dataset called LAION-Aesthetics. According to TechCrunch, Mostaque funded the creation of LAION-5B, the open-source, 250-terabyte dataset containing 5.6 billion images scraped from the internet (LAION is short for Large-scale Artificial Intelligence Open Network, the name of a nonprofit AI organization). LAION-400M, the predecessor to LAION-5B, notoriously contained abhorrent content; a 2021 preprint study of the dataset found that it was full of "troublesome and explicit images and text pairs of rape, pornography, malign stereotypes, racist and ethnic slurs, and other extremely problematic content." The dataset was bad enough that the Google Research team behind Imagen, another powerful text-to-image diffusion model, refused to release its model to the public because it used LAION-400M and couldn't guarantee the model wouldn't produce harmful stereotypes and representations.
Stable Diffusion's testers and AI filters whittled LAION-5B down, first to two billion images and then, by training a model to predict the rating people would give when shown an image and asked "How much do you like this image on a scale from 1 to 10?," to the roughly 120 million top-scoring images that make up the LAION-Aesthetics dataset. TechCrunch reported that LAION-Aesthetics was also allegedly meant to cull out pornographic images.
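The mechanics of that filtering step are simple in outline. The sketch below is a toy illustration, not LAION's actual code: the real predictor is reportedly a small model trained on top of CLIP image embeddings, and the architecture, embedding dimension, and cutoff used here are assumptions.

```python
# Toy sketch of aesthetic filtering (not LAION's actual implementation):
# a small regression head is trained on human "1 to 10" ratings, then run
# over the dataset; only images scoring above a cutoff are kept.
import torch
import torch.nn as nn

class AestheticPredictor(nn.Module):
    def __init__(self, embed_dim: int = 768):  # assumed CLIP embedding size
        super().__init__()
        self.head = nn.Linear(embed_dim, 1)  # the real predictor is a small MLP

    def forward(self, image_embeddings: torch.Tensor) -> torch.Tensor:
        # Returns a predicted rating per image, roughly on the 1-10 scale.
        return self.head(image_embeddings).squeeze(-1)

def filter_by_aesthetics(image_embeddings: torch.Tensor,
                         predictor: AestheticPredictor,
                         cutoff: float = 6.0) -> torch.Tensor:
    """Return indices of images whose predicted rating exceeds the cutoff."""
    with torch.no_grad():
        scores = predictor(image_embeddings)
    return torch.nonzero(scores > cutoff).squeeze(-1)
```

Whatever the exact cutoff, the point is that "aesthetic" here is a learned proxy for human taste, not a porn filter, which helps explain why explicit images survived the cull.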
Ashley22 told me that the LAION-Aesthetics dataset is part of what makes the model especially good for making erotic images. “That aesthetic beauty is not specifically erotic, but I think due to the high quality visual quality of the dataset it is especially well-suited to NSFW,” they said.
From my own testing of the model using Stability AI's DreamStudio platform, which runs the model on its own servers, the "Aesthetics" filtering didn't catch all the nudes, not by a long shot. Using workaround code written and shared by someone in the Unstable Diffusion Discord, which, when inserted into the browser-based app, partially disabled the content filter, I was able to make NSFW images using "sex" and "breasts," and the model generated some fairly hardcore surrealist pornographic images (including one where a woman literally is a man's penis?).
Mostaque said that DreamStudio will be updated shortly to allow customization of filters and parameters. "The classifier can also be used to exclude other things, for example Clowns if one is scared of them and understands concepts and abstract notions. This has broader use in increasing internet safety."
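Mostaque didn't detail how such a filter would work, but one plausible shape, sketched below under the assumption of an off-the-shelf CLIP model, is to compare a generated image's embedding against embeddings of user-blocked concepts and flag anything too similar. The model name and similarity threshold here are illustrative, not DreamStudio's actual values.

```python
# Illustrative concept filter (not DreamStudio's actual implementation):
# flag an image if its CLIP embedding is too close to any blocked concept.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def is_blocked(image: Image.Image, blocked_concepts: list[str],
               threshold: float = 0.25) -> bool:
    inputs = processor(text=blocked_concepts, images=image,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        out = model(**inputs)
    # Cosine similarity between the image and each blocked-concept embedding.
    img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
    txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
    similarities = (img @ txt.T).squeeze(0)
    return bool((similarities > threshold).any())

# A coulrophobic user, per Mostaque's example, might pass
# blocked_concepts=["a clown"].
```

The appeal of this design is that the blocklist is just text, so each user can define their own no-go concepts rather than inheriting a platform-wide one.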
So far, most of the images people are generating with Stable Diffusion aren't problematic, just harmlessly horny. But as in any AI-generated porn community, people simply cannot resist creating fake porn of real people, especially celebrities. Celebrities' images are all over the internet and, it's fair to assume, all over the LAION datasets. For some people, the allure of famous women as imagined by an algorithm, breasts and genitalia exposed, is too great, and the Stable Diffusion NSFW communities are already seeing this happen.
Five years after machine learning researchers and engineers first started throwing their hands up about fake, nonconsensual porn being generated by AI hobbyists—which remains the most prevalent use of deepfake technology—people in these communities are still shrugging about it when confronted with the problem.
I asked johnnyornot what they thought of people going this route. "I guess mimicking an existing human could harm the target's reputation, even if the images are not actually of them," they said. "Perhaps they should all be labeled as AI generated. Perhaps we should learn to not believe everything we see online. There is probably a lot to consider with this and on the wider topic of AI in general."
Ashley22 sees it as fair use. “If I draw a picture of Keanu Reeves naked, that doesn’t violate consent laws, I think that this is comparable,” they said. “And people do stuff like that all the time, we tend not to care unless people try to use the images commercially.”
Right now, the results are still much too rough to trick anyone into thinking they're real snapshots of nudes, but these creations are definitely adjacent to malicious deepfakes; AI ethicists have said this use of algorithmically generated likenesses is harmful, and it is widely considered non-consensual intimate imagery.
As of this writing, the majority of users are more interested in creating elf-hentai fantasies than images of celebrities. Unlike deepfakes, which rely on real videos and photos to work toward a standard of believability, people working with this new generation of text-to-image models seem less concerned with realism and more interested in, for example, what Shrek would look like with huge boobs. With an AI's nearly unlimited scope of billions of combinations, syntheses, and fetishizations made from whatever people can dream up and put into words, will the pastime of making images of topless celebrities become too boring?
“Make no mistake, these models are heading towards multimodality, it will get better, more realistic, and absolutely more widespread,” Ashley22 said. “The implication is that someday, just about anyone will be able to make just about any media.”
Update 8/25, 1:30 p.m. EST: This piece was updated to reflect that Reddit banned r/stablediffusionnsfw, r/PornDiffusion, and r/UnstableDiffusion.