Tech by VICE

This Site Shows You What AI Really Thinks of You

'ImageNet Roulette' exposes the racism and misogyny in commonly used AI training data. We need more tools like it.

by Jordan Pearson
Sep 16 2019, 7:30pm

Have you ever wondered what a computer thinks of you when it automatically detects your face before applying a cat filter? Thanks to a new AI tool, you can find out, but fair warning: the reality isn't pretty.

"ImageNet Roulette" is a website created by programmer Leif Ryge for researcher Kate Crawford and artist Trevor Paglen's recent art exhibit "Training Humans." The site takes your photo and runs it through some common machine learning software before returning the labels that the AI decided to apply to you. As numerous people discovered (and tweeted about) while using the tool, these labels are often weird, mean, racist, and misogynistic.

For example, ImageNet Roulette described Denzel Washington with racist slurs and outdated, inappropriate language when I plugged the famous actor's photo into the software. (Journalist Stephen Bush reported a similar experience on Twitter using his own photo.) When I tried a photo of author Naomi Klein, the tool described her as a "sister," as in a nun.

ImageNet Roulette is designed to reveal how biased AI can be under the hood. Machine learning has already seeped into many aspects of our lives, and as evidenced by the deluge of tweets about ImageNet Roulette, popular tools can still highlight uncomfortable truths about new technology that practitioners may have already accepted.

"ImageNet Roulette regularly classifies people in dubious and cruel ways," the site description says. That is because it is trained on images and labels from ImageNet, a massive (and massively popular) image dataset for training AI models. The database was created by academics who pulled millions of images from the internet and paired them with labels from an earlier semantic database called WordNet. It has been used and reused countless times, and has more than 12,000 citations in academic literature.

"ImageNet Roulette is meant in part to demonstrate how various kinds of politics propagate through technical systems," the site says, "often without the creators of those systems even being aware of them."

Machine learning systems depend on huge troves of "training" data (in ImageNet's case, labeled images) to "learn" how to identify and label things they've never seen before. As described in ImageNet's originating 2009 paper, Princeton researchers decided to create a robust training dataset for visual recognition tasks. They worked backwards from WordNet by searching for images on the internet using its terms, and got workers on Amazon's Mechanical Turk platform to confirm the appropriate labelling.
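To make that pipeline concrete, here is a rough Python sketch of the confirmation step: candidate images are found by searching for each term, and a pairing survives only if enough human reviewers agree with it. Everything in it is an assumption for illustration, including the function name `confirm_labels`, the sample image IDs, and the simple majority threshold. It is not the procedure from the ImageNet paper, which is more elaborate.

```python
def confirm_labels(candidates, votes, min_agreement=0.5):
    """Keep (image, term) pairs that most reviewers confirmed.

    candidates: list of (image_id, term) pairs found by searching for each term.
    votes: dict mapping (image_id, term) to a list of True/False judgments.
    min_agreement: hypothetical threshold; a pair is kept only if the share
        of "yes" votes is strictly greater than this value.
    """
    confirmed = []
    for pair in candidates:
        judgments = votes.get(pair, [])
        if not judgments:
            continue  # nobody reviewed this pair, so it is dropped
        yes_share = sum(judgments) / len(judgments)
        if yes_share > min_agreement:
            confirmed.append(pair)
    return confirmed


# Made-up example data: the terms come from a fixed vocabulary chosen up front.
candidates = [("img1", "nun"), ("img2", "nun"), ("img3", "actor")]
votes = {
    ("img1", "nun"): [True, True, False],
    ("img2", "nun"): [False, False, True],
    ("img3", "actor"): [True, True, True],
}
dataset = confirm_labels(candidates, votes)
print(dataset)
```

Note what the reviewers in this sketch can and cannot do: they accept or reject a pairing, but they never change the vocabulary itself. Any term in the starting word list, however loaded, can end up attached to a person's photo if enough reviewers say "yes."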

Like other shared datasets, ImageNet took on a life of its own after publication. Indeed, ImageNet Roulette is itself an example of an unanticipated use of the database, one that can reproduce human prejudices. Thankfully, ImageNet Roulette exposes the inner workings of common machine learning approaches for the common good rather than obscuring them. As AI advances both for good and for ill, we need more tools like it.

This isn't about the future anymore. It's already happening.