People like to think computer programs are objective, but technology is actually learning from our human biases.
Last month, a team of researchers from Boston University and Microsoft Research came up with a solution: an algorithm capable of identifying and eradicating offensive stereotypes in writing. The algorithm analyzes how often seemingly gender-neutral words are paired with gendered pronouns like "she" or "he" to find out which words a computer would most likely associate with men or women.
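The pairing count described above can be illustrated with a minimal sketch. It is not the researchers' code: the toy sentences, the pronoun sets and the `gender_association` helper are all hypothetical, and the published work operates on word embeddings trained on a large corpus rather than on raw sentence counts.

```python
from collections import Counter

# Hypothetical toy corpus; these sentences are invented for illustration only.
TOY_CORPUS = [
    "she works as a nanny",
    "she is a nanny and he is a skipper",
    "he was the skipper of the boat",
    "he is a skipper",
    "she said the teacher was late",
    "he said the teacher was early",
]

FEMALE_PRONOUNS = {"she", "her", "hers"}
MALE_PRONOUNS = {"he", "him", "his"}


def gender_association(corpus):
    """Return {word: score} with each score in [-1, 1].

    +1 means the word only ever shares a sentence with female pronouns,
    -1 means only with male pronouns, and 0 means an even split.
    """
    female_counts = Counter()
    male_counts = Counter()
    for sentence in corpus:
        tokens = set(sentence.lower().split())
        has_female = bool(tokens & FEMALE_PRONOUNS)
        has_male = bool(tokens & MALE_PRONOUNS)
        # Count every non-pronoun word once per sentence.
        for word in tokens - FEMALE_PRONOUNS - MALE_PRONOUNS:
            if has_female:
                female_counts[word] += 1
            if has_male:
                male_counts[word] += 1

    scores = {}
    for word in set(female_counts) | set(male_counts):
        f, m = female_counts[word], male_counts[word]
        scores[word] = (f - m) / (f + m)
    return scores


scores = gender_association(TOY_CORPUS)
```

On this toy input, "nanny" gets a positive score, "skipper" a negative one, and "teacher" lands at zero because it appears equally often alongside each pronoun.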
The team expected the algorithm, which processed a three-million-word corpus of Google News stories, to exhibit gender bias. But the results were even more extreme than they had predicted: for occupation searches, women were disproportionately billed as a "hairdresser", "socialite" or "nanny", while men were associated with "maestro", "skipper" or "protégé".
"I was surprised by how apparent and concrete these biases proved," said James Zou, one of the researchers behind the project. "Our method makes it very clear that there are these deeply embedded stereotypes in the machine learning algorithm."
The researchers also asked the algorithm to complete a set of analogies, drawing parallels based on the text it had analyzed. Some, such as "man is to woman as king is to queen", were relatively innocuous; others were much less so. When asked what the female equivalent of "surgeon" was, the algorithm returned "nurse". When asked the female equivalent of "computer programmer", it returned "homemaker". This is especially troubling in industries where women are already paid thousands of dollars less than men.
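Analogy completion of this kind is commonly done with vector arithmetic on word embeddings: take the vector for "king", subtract "man", add "woman", and pick the nearest remaining word. The sketch below uses that common technique, which is not necessarily the exact procedure the team used. The three-dimensional vectors are hypothetical and were deliberately built with a gender skew on "surgeon" and "nurse" to show how a biased embedding produces a biased answer.

```python
import math

# Hypothetical toy vectors with axes (gender, royalty, medicine).
# Real embeddings have hundreds of dimensions learned from text.
TOY_VECTORS = {
    "man": [1.0, 0.0, 0.0],
    "woman": [-1.0, 0.0, 0.0],
    "king": [1.0, 1.0, 0.0],
    "queen": [-1.0, 1.0, 0.0],
    "surgeon": [0.6, 0.0, 1.0],
    "nurse": [-0.6, 0.0, 1.0],
}


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def complete_analogy(a, b, c, vectors):
    """Answer 'a is to b as c is to ?' using the vector offset c - a + b."""
    target = [
        vc - va + vb
        for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])
    ]
    # The three query words are excluded so the answer is a new word.
    candidates = [w for w in vectors if w not in {a, b, c}]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


royal_answer = complete_analogy("man", "woman", "king", TOY_VECTORS)
medical_answer = complete_analogy("man", "woman", "surgeon", TOY_VECTORS)
```

With these toy vectors the first query returns "queen" and the second returns "nurse", mirroring the pattern the researchers reported.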
Gender bias is frustratingly familiar to working women. Around 83 percent of women in the US believe their office discriminates based on gender, and 45 percent have experienced it personally. If gender bias continues to infiltrate machine learning, it will likely make gender disparity even worse. Imagine a student using a search algorithm to find papers on computer programming. If the search engine they use has learned to process language on a biased dataset, it might associate programming with male names rather than female names. A paper by Mary might rank below a paper by Mark, making it harder to find and probably less widely read.
"Machine learning algorithms are everywhere in our society," explains Zou. "Every time we interact with our smartphones or our computer, we're interacting with them. And because they're trained on datasets containing stereotypes, they propagate those same stereotypes, sometimes in very damaging ways."
And that bias extends beyond gender. The team analyzed the same Google News corpus for signs of racial bias, and found similar results.
"If you give the algorithm a stereotypically white name, say Emily, and a stereotypically black name like Ebony, it generates a bunch of biased analogies, like singer-songwriter to rapper or pancakes to fried chicken, which are very racially biased," said Adam Kalai, another researcher on the project. "It's shocking, and more so when you think this is being used in a variety of places."
As machine learning becomes more prevalent, the need to rid algorithms of bias becomes more pressing. Luckily, computers can be rid of these prejudices far more easily than humans.
"People are biased," Kalai said. "Everyone makes assumptions. But with computers it's different. With just a little bit of programming, we can remove those associations."
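One common way to remove such an association from a word vector, often called neutralizing, is to subtract the part of the vector that points along a gender direction. The sketch below assumes that approach and should not be read as the team's exact procedure. The vectors for "he", "she" and "programmer" are hypothetical toy values, as are the helper names.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a non-zero vector to unit length."""
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]


def neutralize(vector, direction):
    """Remove the component of `vector` that lies along `direction`."""
    g = normalize(direction)
    scale = dot(vector, g)
    return [x - scale * gx for x, gx in zip(vector, g)]


# Hypothetical toy vectors; real embeddings are learned from text.
he = [1.0, 0.0, 0.2]
she = [-1.0, 0.0, 0.2]
programmer = [0.5, 0.3, 0.8]

# Assumption: the difference between "he" and "she" approximates gender.
gender_direction = [a - b for a, b in zip(he, she)]

debiased = neutralize(programmer, gender_direction)
```

After this step the toy "programmer" vector has no component along the gender direction, while its other coordinates are unchanged.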
Kalai sees a huge advantage to removing bias from language before it infiltrates machine learning further. But he sees his role as mechanical, not moral. The algorithm gives programmers the option to de-bias their work—but it doesn't come with an obligation.
"People can choose to negate biases, or go further and take affirmative action," Kalai said. Or they could make it worse. "We're researchers, so we don't feel our role is to choose which biases to remove and which to keep."
Zou sees it a little differently. If the algorithm can de-bias our language, he says, then it should. "Machine learning has reached a fairly mature stage, and so has AI in general. It's important for us to think about the social contexts and impacts of our work."
"Even a small degree of prejudice in machine learning can be amplified. If we're not careful about stereotypes, that will have concrete, detrimental social effects."