To the casual observer, DALL-E is Silicon Valley’s latest miraculous AI creation—a machine learning system that allows anyone to generate almost any image just by typing a short description into a text box. From just a few descriptive words, the system can conjure up an image of cats playing chess, or a teapot that looks like an avocado.
It’s an impressive trick using the latest advances in natural language processing, or NLP, which involves teaching algorithmic systems how to parse and respond to human language—often with creepily realistic results. Named after both surrealist painter Salvador Dalí and the lovable Pixar robot WALL-E, DALL-E was created by research lab OpenAI, which is well-known in the field for creating the groundbreaking NLP systems GPT-2 and GPT-3.
But just like those previous experiments, DALL-E suffers from the same racist and sexist bias AI ethicists have been warning about for years.
Machine learning systems almost universally exhibit bias against women and people of color, and DALL-E is no different. In the project’s documentation on GitHub, OpenAI admits that “models like DALL·E 2 could be used to generate a wide range of deceptive and otherwise harmful content” and that the system “inherits various biases from its training data, and its outputs sometimes reinforce societal stereotypes.” The documentation comes with a content warning that states “this document may contain visual and written content that some may find disturbing or offensive, including content that is sexual, hateful, or violent in nature, as well as that which depicts or refers to stereotypes.”
It also says that the use of DALL-E “has the potential to harm individuals and groups by reinforcing stereotypes, erasing or denigrating them, providing them with disparately low quality performance, or by subjecting them to indignity. These behaviors reflect biases present in DALL-E 2 training data and the way in which the model is trained.”
The examples from DALL-E’s preview documentation are pretty bad. For instance, prompts containing the word “CEO” exclusively generate images of white-passing men in business suits, while prompts like “nurse” or “personal assistant” lead the system to create images of women. The researchers also warn the system could be used for disinformation and harassment, for example by generating deepfakes or doctored images of news events.
In a statement emailed to Motherboard, an OpenAI spokesperson wrote that the researchers had implemented safeguards for the DALL-E system, and noted that the preview code is currently only available to a select number of trusted users who have agreed to its content policy.
“In developing this research release of DALL-E, our team built in mitigations to prevent harmful outputs, curating the pretraining data, developing filters and implementing both human- and automated monitoring of generated images,” the spokesperson wrote. “Moving forward, we’re working to measure how our models might pick up biases in the training data and explore how tools like fine-tuning and our Alignment techniques may be able to help address particular biases, among other areas of research in this space.”
Some AI experts say that the core of this problem is not a lack of mitigations, but the increasing use of large language models (LLMs), a class of AI models containing hundreds of billions of parameters, which allows engineers to adapt machine learning systems to a variety of tasks with relatively little additional training. AI researchers have criticized large models like GPT-3 for producing horrifying results that reinforce racist and sexist stereotypes, arguing that the massive scale of these models is inherently risky and makes auditing the systems virtually impossible. Before being fired from Google, AI ethicist Timnit Gebru co-authored a paper warning of the dangers of LLMs, specifically noting their capacity to harm marginalized groups.
OpenAI offers no solutions to these issues, saying that it is in the early stages of examining bias in the DALL-E system and that its risk analysis should be regarded as preliminary.
“We are sharing these findings in order to enable broader understanding of image generation and modification technology and some of the associated risks, and to provide additional context for users of the DALL·E 2 Preview,” the researchers write. “Without sufficient guardrails, models like DALL·E 2 could be used to generate a wide range of deceptive and otherwise harmful content, and could affect how people perceive the authenticity of content more generally.”
This article has been updated with a statement from OpenAI.