
Algorithms Have Nearly Mastered Human Language. Why Can’t They Stop Being Sexist?

To fight gender bias, researchers are training language-processing algorithms to envision a world where it doesn’t exist.

Teaching computers to understand human language used to be a tedious and imprecise process. Now, language algorithms analyze oceans of text to teach themselves how language works. The results can be unsettling, such as when the Microsoft bot Tay taught itself to be racist after a single day of exposure to humans on Twitter.

It turns out that data-fueled algorithms are no better than humans—and frequently, they’re worse.


“Data and datasets are not objective; they are creations of human design,” writes data researcher Kate Crawford. When designers miss or ignore the imprint of biased data on their models, the result is what Crawford calls a “signal problem,” where “data are assumed to accurately reflect the social world, but there are significant gaps, with little or no signal coming from particular communities.”

Siri, Google Translate, and job applicant tracking systems all use the same kind of algorithm to talk to humans. Like other machine learning systems, NLPs (short for “natural language processors” or sometimes “natural language programs”) are bits of code that comb through vast troves of human writing and churn out something else: insights, suggestions, even policy recommendations. And like all machine learning applications, an NLP program’s functionality is tied to its training data, that is, the raw information that has informed the machine’s understanding of the reading material.

Skewed data is a very old problem in the social sciences, but machine learning hides its bias under a layer of confusion. Even AI researchers who work with machine learning models––like neural nets, which use weighted variables to approximate the decision-making functions of a human brain––don’t know exactly how bias creeps into their work, let alone how to address it.

As NLP systems creep into every corner of the digital world, from job recruitment software to hate speech detectors to police data, that signal problem grows to fit the size of its real-world container. Every industry that uses machine learning solutions risks contamination. Algorithms given jurisdiction over public services like healthcare frequently exacerbate inequalities, excusing the ancient practice of shifting blame onto the most vulnerable populations for their circumstances in order to redistribute the best services to the least in need; models that try to predict where crime will occur can wind up making racist police practices even worse.



The first step to fixing a sexist algorithm is admitting it’s not “broken.” Machine bias is human bias, and it can start polluting the data before the decision-making code even starts to run. The selling point of machine learning is that it teaches itself, to varying degrees, how to learn from the data it is given.

“You don’t tell an NLP the grammatical rules of the language explicitly,” said Ran Zmigrod, a PhD student in computer science at the University of Cambridge who specializes in “debiasing” these models.

Instead, Zmigrod explained, the code uses training data to identify the language’s important rules, and then applies those rules to the task on a smaller, more focused dataset. One way a model might do this is with Markov chains, which estimate how closely associated two elements are by seeing how well the presence of one predicts the presence of the other. For example, it checks whether having “homemaker” in a text correlates with the word “she.”
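To make the “homemaker” check concrete, here is a minimal sketch of that kind of number-running. It is not a full Markov chain and not Zmigrod’s code: it simply estimates how often one word shows up in sentences that contain another. The function name `cooccurrence_rate` and the four-sentence corpus are invented for illustration.

```python
def cooccurrence_rate(sentences, cue, target):
    """Estimate P(target appears | cue appears) across a list of sentences."""
    with_cue = 0
    with_both = 0
    for sentence in sentences:
        # Lowercase and split on whitespace; a real system would tokenize properly.
        words = set(sentence.lower().split())
        if cue in words:
            with_cue += 1
            if target in words:
                with_both += 1
    # Avoid dividing by zero when the cue word never appears.
    return with_both / with_cue if with_cue else 0.0


# Hypothetical toy corpus, made up for this example.
toy_corpus = [
    "she is a homemaker",
    "the homemaker said she was tired",
    "he is a homemaker",
    "he is a programmer",
]

homemaker_she = cooccurrence_rate(toy_corpus, "homemaker", "she")
programmer_she = cooccurrence_rate(toy_corpus, "programmer", "she")
```

In this toy corpus, “she” appears in two of the three sentences that mention “homemaker” and in none that mention “programmer.” A model that learns only from these counts has no way to know the skew came from the writers rather than from the world.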

If that just sounds like a fancy way of running the numbers, it’s because that’s exactly what it is. “People see machine learning as something supernatural, but I just view it as a very elaborate statistics,” Andrea Eunbee Jang, a research intern at the Mila Artificial Intelligence Institute in Quebec, told Motherboard.

Jang is part of a project at Mila that, last year, developed a taxonomy for all the types of gender bias found in NLP datasets. The programmer masculinity problem, according to Jang, is a prime example of a “gender generalization,” or an assumption based on over-valuing of gender stereotypes. “In machine learning in general, data will be tailored for the majority group,” Jang said. Added fellow researcher Yasmeen Hitti: “If the text the model is based on has only seen male programmers, it will assume that the programmer you’re typing about is also male. It just depends on what it’s seen before.”



Though projects that scrape new data from Twitter and YouTube have begun to dot the academic landscape, the traditional “ground truth” datasets used to build NLPs come from free collections—like the e-book archives at Project Gutenberg, collections of movie and restaurant reviews translated to plain text for machine learning applications, and dictionary entries. These huge piles of sentences are chosen to represent a range of language forms, but not necessarily a range of perspectives.

The term “ground truth” comes from meteorology and refers to trustworthy data that comes straight from the source, often acting as a reality check for information that relies on remote-sensing technology. For a scientist tracking a hurricane with a Doppler radar, the storm chaser taking pictures on location holds the ground truth, proving that the radar’s readouts can generally be trusted. The problem arises when, in machine learning, the ground truth is already skewed, as if a storm chaser called in 80 percent of hurricanes with male names but only recognized 50 percent of Katrinas and Irmas.

With NLPs, the complexity of language gives bias an additional opportunity to worm its way into the results. In order to teach their systems to read, developers often borrow supplementary datasets that contain extra information about how words are used, known as “word embeddings.” Projects like Word2vec supplement the text being studied with minute insights about what words are most like what other words, which helps algorithms produce language that follows rules even more subtle than grammar: the rules of meaning.
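One common way to ask “what words are most like what other words” is to represent each word as a list of numbers and compare the angles between those lists, a measure called cosine similarity. The sketch below assumes three hand-picked, three-dimensional vectors; real embeddings such as those from Word2vec have hundreds of dimensions and are learned from text, and nothing here calls the Word2vec software itself.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical vectors chosen by hand for illustration only.
toy_embeddings = {
    "programmer": [0.9, 0.1, 0.3],
    "engineer": [0.8, 0.2, 0.3],
    "banana": [0.0, 0.9, 0.1],
}


def most_similar(word, embeddings):
    """Return the other word whose vector points in the most similar direction."""
    query = embeddings[word]
    scores = {
        other: cosine_similarity(query, vector)
        for other, vector in embeddings.items()
        if other != word
    }
    return max(scores, key=scores.get)


nearest = most_similar("programmer", toy_embeddings)
```

Here “programmer” lands closest to “engineer” because their numbers were written to be similar. In a learned embedding, the same arithmetic can place “programmer” closer to “he” than to “she” if that is how the words were used in the source text.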


Essentially, word embeddings find parallels in usage, according to María De-Arteaga, a Carnegie Mellon researcher specializing in machine learning and public policy. “Word embeddings are very popular, especially when you don’t have much data,” De-Arteaga told Motherboard.

Embeddings are the reason Google search is so powerful and chatbots can follow a conversation: their insights into how words are used by real people are fundamental to an NLP’s understanding of a language. But those insights (the data contained in Word2vec and other word embedding datasets) are themselves derived from old books and movie reviews, and sometimes tweets. An algorithm that turns to embeddings to understand the real world is still encumbered by the biases of every human voice represented in that data.

“If you want to use word embeddings to analyze how women have been represented in the meaning, then the presence of that bias is actually useful,” De-Arteaga said, “but if you’re saying you’re using it as a dictionary, as your ground truth, then you’re considering bias as truth.”

De-Arteaga’s recent projects focus on limiting the skewing power of word embeddings in different contexts––for example, by using “fairness constraints” to force the model to take away points for accuracy when it relies too much on stereotypes. Another approach she’s tried: “scrubbing” the data, or completely removing gender-linked aspects of words from the dataset before analysis. Both scrubbing and fairness constraints help reduce sexist outputs, but not enough, says De-Arteaga.
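As a rough illustration of what “scrubbing” can look like, the sketch below blanks out explicit gender markers in a piece of text before any model sees it. This is one possible reading of the idea, not De-Arteaga’s actual procedure: the word list, the placeholder character, and the function name `scrub_gender_markers` are all assumptions made for the example.

```python
import re

# Assumed list of explicit gender markers; a real list would be far longer.
GENDER_MARKERS = {
    "he", "she", "him", "her", "his", "hers",
    "himself", "herself", "mr", "mrs", "ms",
}


def scrub_gender_markers(text, placeholder="_"):
    """Replace each word found in GENDER_MARKERS with a neutral placeholder."""

    def replace(match):
        word = match.group(0)
        return placeholder if word.lower() in GENDER_MARKERS else word

    # Visit every run of letters and leave punctuation and spacing untouched.
    return re.sub(r"[A-Za-z]+", replace, text)


scrubbed = scrub_gender_markers("She told him that her code works.")
```

The example sentence comes out as “_ told _ that _ code works.” Scrubbing of this kind removes only the obvious signals. Gender can still leak through names, job titles, and other words that correlate with it, which fits De-Arteaga’s point that these fixes help but do not go far enough.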



Ran Zmigrod is part of a new cohort of researchers searching for fairness in the training data itself––including in the ground truth. His group at Cambridge manipulated their algorithm’s training data on purpose by toying with their ground truth to represent a less sexist world. Essentially, they pick out every sentence in the corpus that contains gendered language and double it with different pronouns––so for every sentence in the corpus like “He is a programmer,” the model adds “She is a programmer” to the data as well (Zmigrod is still working on the gender-neutral version). The result is a gender-balanced corpus that is based on a different world than ours, but produces a remarkably fair result. “We’re not generating new ideas,” insists Zmigrod; “we’re just changing the gender of the corpus. Unless you’re specifically trying to look at gender in the text, the changes we make won’t matter to the result.”
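The doubling step can be sketched in a few lines. This is a simplified illustration, not the Cambridge group’s code: it swaps only the unambiguous pairs “he/she” and “himself/herself,” because words like “her” can correspond to either “his” or “him” and need grammatical analysis to swap correctly. The names `SWAPS`, `swap_gender`, and `augment` are invented for the example.

```python
import re

# Only unambiguous pairs; "his", "him", and "her" are deliberately left out.
SWAPS = {"he": "she", "she": "he", "himself": "herself", "herself": "himself"}
PATTERN = re.compile(r"\b(" + "|".join(SWAPS) + r")\b", re.IGNORECASE)


def swap_gender(sentence):
    """Return the sentence with each listed pronoun replaced by its counterpart."""

    def replace(match):
        word = match.group(0)
        swapped = SWAPS[word.lower()]
        # Keep a leading capital so sentence-initial pronouns stay capitalized.
        return swapped.capitalize() if word[0].isupper() else swapped

    return PATTERN.sub(replace, sentence)


def augment(corpus):
    """Keep every sentence and add a swapped copy of each gendered one."""
    augmented = []
    for sentence in corpus:
        augmented.append(sentence)
        counterpart = swap_gender(sentence)
        if counterpart != sentence:
            augmented.append(counterpart)
    return augmented


balanced = augment(["He is a programmer.", "The laptop is new."])
```

The first sentence gains a “She is a programmer.” twin, while the second, which has no gendered words, is left alone. That mirrors Zmigrod’s claim that the changes matter only to results that depend on gender.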

Fixing gender imbalance in datasets is only one of the many interdependent efforts to address the injustices propagated by machine learning applications. De-Arteaga says it’s impossible to know which job placement services are using which kinds of language processing systems, but the influence of these programs already affects the hiring market as a whole. Outside of applicant tracking, it’s hard to measure the impact of biased NLPs, especially because machine learning applications can seem so opaque to anyone outside the AI field.


One overlooked cause of that opacity is the fact that known debiasing methods are especially weak in languages with grammatical gender, like Spanish, which puts neckties and women in the same grammatical class (la corbata; la mujer) in contrast to, say, dresses and men (el vestido; el hombre). The whole NLP field is absurdly English-centric, but gender debiasing approaches are especially hard to translate. The models wind up having to choose between gender balance and correct grammar, which means they’re useless either way.

A key exception to that is Zmigrod’s approach, which has shown promise in gendered languages like Spanish and Hebrew, though it takes a lot of processing power to keep up with all the standard gendered articles and endings, let alone new pronouns to describe nonbinary or gender-neutral identities.

Attempts to regulate the use of NLPs containing bias––that is, all of them––may be somewhere on the distant political horizon. But until then, it’s possible that the best way to intervene in a biased pipeline is to start with the researchers themselves. Yasmeen Hitti and Andrea Eunbee Jang of Mila initially came together as part of the AI for Social Good summer lab, which takes aim at machine learning bias by bringing women researchers onto AI projects early on.

Hitti, Jang, and fellow researcher Carolyne Pelletier are now deep in the data-mining phase of their current project on gender generalizations, but they’re also looking ahead to new ways to build justice into the pipeline.

“In our paper, we talk about […] two genders, male and female, but we also consulted with non-binary activists to see if our model could be adapted to their needs,” explained Pelletier. “But it’s not that different. When you think about it, sentences like ‘A programmer must always carry his laptop’ are biased against both she programmers and they programmers.”

Finally, there is the source of the data: human bias. “Our goal is to train models [to be less sexist], but maybe in the end, it’s easier to train a human,” Hitti told Motherboard. “We’re spending a lot of time trying to teach the machine, but we have intelligence, too. Maybe we could put a little effort into being more inclusive.”