There are lots of conversations about the lack of diversity in science and tech these days. In response, people constantly ask, "So what? Why does it matter?" There are many ways to answer that question, but perhaps the easiest is this: because a homogeneous team produces homogeneous products for a very heterogeneous world.
This is Design Bias, a monthly Motherboard column in which writer Rose Eveleth explores the products, research programs, and conclusions made not necessarily because any designer or scientist or engineer sets out to discriminate, but because to them the "normal" user always looks exactly the same. The result is a world that's biased by design. -the Editor
Are you trustworthy? For centuries this was a qualitative question, but no longer. Now you have a number, a score, that everybody from loan officers to landlords will use to determine how much they should trust you.
Credit scores are often presented as objective and neutral, but they have a long history of prejudice. Most changes in how credit scores are calculated over the years, including the shift from human assessment to computer calculations and most recently to artificial intelligence, have come out of a desire to make the scores more equitable. Yet credit companies have failed to remove bias, such as bias based on race or gender, from their systems.
More recently, credit companies have started to use machine learning and offer "alternative credit" as a way to reduce bias in credit scores. The idea is to use data that isn't normally included in a credit score to try and get a sense for how trustworthy someone might be. All data is potential credit data, these companies argue, which could include everything from your sexual orientation to your political beliefs, and even what high school you went to.
But introducing this "non-traditional" information to credit scores runs the risk of making them even more biased than they already are, eroding nearly 150 years of effort to eliminate unfairness in the system.
How credit scores evolved from humans to AI
In the 1800s, credit was determined by humans—mostly white, middle-class men—who were hired to go around and inquire about just how trustworthy a person really was. "The reporter's task was to determine the credit worthiness of individuals, necessitating often a good deal of snooping into the private and business lives of local merchants," wrote historian David A. Gerber.
These reporters' notes revealed their often racist biases. After the Civil War, for instance, a Georgia-based credit reporter described a liquor store named A.G. Marks' liquor as "a low Negro shop." One reporter from Buffalo, N.Y. wrote in the 1880s that "prudence in large transactions with all Jews should be used."
Early credit companies knew that impressionistic records were biased, and introduced a more quantitative score to try to combat the prejudices of credit reporters. In 1935, for example, the federal Home Owners' Loan Corporation created a map of Atlanta, showing neighborhoods where mortgage lending was "best," coded in green, compared to "hazardous," coded in red.
This solution, it turned out, codified the discrimination against minorities by credit companies. Neighborhoods coded red were almost exclusively those occupied by racial minorities. These scores contributed to what's called "redlining," a systematic refusal by banks to make loans or locate branches in these "hazardous" areas.
The FICO score, the three-digit number most of us associate with credit scores today, was one of the biggest attempts at fixing the bias in the credit system. Introduced in 1989 by data analytics company FICO (known as Fair, Isaac, and Company at the time), the FICO score relies on data from your bank, such as how much you owe, how promptly you pay your bills, and the types of credit you use.
"By removing bias from the lending process, FICO has helped millions of people get the credit they deserve," it says on its website.
Today, there are laws meant to prevent discrimination in credit scores on the basis of race, color, religion, national origin, sex, marital status, or age. But the reality is less rosy. For decades, banks have targeted historically redlined communities with predatory mortgages and loans, which many communities have yet to recover from.
Experts estimate that the higher rates of foreclosure on predatory mortgages wiped out nearly $400 billion in communities of color between 2009 and 2012. The companies that buy up debts and take people to court target people of color more than any other group.
Study after study shows that credit scores in communities occupied mostly by people of color are far lower than those in nearby communities occupied mostly by white people. A 2007 study done by the Federal Reserve Board found that the mean score of Black people in the United States was half that of white people.
How algorithms can bring down credit scores
Against this backdrop, banks are turning to algorithms and machine learning to try and "innovate" on the ways credit scores are determined.
Though it's unclear exactly what sorts of algorithms credit companies use, the latest in machine learning is "deep learning." Deep-learning programs are trained on huge amounts of data in order to "learn" patterns and generate an output, such as a credit score. Machine learning depends on a quality training dataset for accuracy, which means AI programs can absorb the prejudices of their creators and of the data they are trained on. For example, Amazon ditched an AI-driven hiring tool trained on resumes after it realized that the program was biased against women.
Financial technology ("fintech") startups are feeding non-traditional data into their algorithms, which take those inputs and generate a credit score. Companies such as ZestFinance, Lenddo, SAS, Equifax, and Kreditech are selling their AI-powered systems to banks and other companies, to use for their own creditworthiness decisions. (Equifax and Lenddo did not respond to a request for comment.)
LenddoEFL, for example, offers a Lenddo Score that "complements traditional underwriting tools, like credit scores, because it relies exclusively on non-traditional data derived from a customer's social data and online behavior." Lenddo even offers an option for applicants to install the Lenddo app on their phones, where it can analyze what is typed into a search bar.
In return, customers are offered quick decisions and the illusion of agency. If everything you do informs your credit score, then you might start thinking that if you just search for "good" things on Google, check in at the "right" places on Facebook, and connect with the right people on social media, you can become lendable.
"It suggests in some ways, that a person could control their behavior and make themselves more lendable," said Tamara K. Nopper, who has done research into alternative data and credit.
In reality, these systems are likely noticing and interpreting signals that customers might not realize they are sending: Your zip code alone, in many cases, can tell a bank how likely it is that you're white. If you went to a historically Black college or university, that data could be used against you. If you use eHarmony, you might get a different credit score than if you use Grindr.
One study from last year on using so-called "digital footprints" to generate credit scores found that iPhone users were more likely to pay loans back than Android users. "The simple act of accessing or registering on a webpage leaves valuable information," the authors wrote.
The algorithms likely wouldn't be told the race or gender of an applicant, but that doesn't prevent the system from making guesses and reflecting existing biases. A 2018 study found that both "face-to-face and FinTech lenders charge Latinx/African-American borrowers 6-9 basis points higher interest rates."
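To make the proxy effect concrete, here is a minimal Python sketch using entirely synthetic data. The two groups, the two zip codes, the 90 percent segregation rate, and the score values are all assumptions made up for illustration; nothing here reflects any real lender's model. The scoring rule is never shown group membership, yet because zip code tracks group so closely, the scores still split along group lines.

```python
import random

# All values below are hypothetical and chosen only for illustration.
SEGREGATION_RATE = 0.9  # assumed share of each group living in its "home" zip
HOME_ZIP = {"A": "11111", "B": "22222"}  # made-up zip codes


def make_applicants(n=10_000, seed=0):
    """Build synthetic applicants whose zip code is correlated with group."""
    rng = random.Random(seed)
    applicants = []
    for _ in range(n):
        group = rng.choice(["A", "B"])
        other = "B" if group == "A" else "A"
        lives_at_home = rng.random() < SEGREGATION_RATE
        zip_code = HOME_ZIP[group] if lives_at_home else HOME_ZIP[other]
        applicants.append({"group": group, "zip": zip_code})
    return applicants


def zip_only_score(applicant):
    """A 'blind' rule: it reads the zip code and never sees the group."""
    return 700 if applicant["zip"] == HOME_ZIP["A"] else 600


def mean_score_by_group(applicants):
    """Average the blind score separately for each group."""
    totals, counts = {}, {}
    for a in applicants:
        totals[a["group"]] = totals.get(a["group"], 0) + zip_only_score(a)
        counts[a["group"]] = counts.get(a["group"], 0) + 1
    return {g: totals[g] / counts[g] for g in totals}


def group_guess_accuracy(applicants):
    """Share of applicants whose group is guessed correctly from zip alone."""
    hits = sum(
        (a["zip"] == HOME_ZIP["A"]) == (a["group"] == "A") for a in applicants
    )
    return hits / len(applicants)


if __name__ == "__main__":
    people = make_applicants()
    print(mean_score_by_group(people))
    print(group_guess_accuracy(people))
```

Under these assumed settings, zip code alone identifies the group for roughly nine in ten applicants, and the two groups' average scores land tens of points apart even though group was never an input.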
Researchers have previously raised the alarm that AI programs could effectively reinstate redlining and similar practices by churning through deeply biased or incomplete data to produce a seemingly objective number. In 2017, Stanford University researchers found that even an ostensibly "fair" algorithm in a pretrial setting can be injected with bias in favor of a particular group, depending on the composition of the training data.
Last year even FICO recognized that an over-reliance on machine learning "can actually obscure risks and shortchange consumers by picking up harmful biases and behaving counterintuitively."
Some fintech companies also recognize the dangers. Douglas Merrill, CEO of ZestFinance, says that its tools are supposed to root out hidden biases, not amplify them. It recently released a tool designed to reward fairness in its models. And Merrill says that ZestFinance is not going to start plugging your tweets into its algorithms any time soon. "Social media data? It doesn't work in credit, and it's just plain creepy," he said.
Why AI bias is so hard to fix
Biases in AI can affect not just individuals with credit scores but also those without any credit at all, as non-traditional data points are used to try to bring new borrowers in.
There is still a whole swath of people in the United States known as the "unbanked" or "credit invisibles." They have too little credit history to generate a traditional credit score, which makes it challenging for them to get loans, apartments, and sometimes even jobs.
According to a 2015 Consumer Financial Protection Bureau study, 45 million Americans fall into the category of credit invisible or unscoreable—that's almost 20 percent of the adult population. And here again we can see a racial divide: 27 percent of Black and Hispanic adults are credit invisible or unscoreable, compared to just 16 percent of white adults.
To bring these "invisible" consumers into the credit score fold, companies have proposed alternative credit. FICO recently released FICO XD, which includes payment data from TV or cable accounts, utilities, cell phones, and landlines. Other companies have proposed social media posts, job history, educational history, and even restaurant reviews or business check-ins.
Lenders say that alternative data is a benefit to those who have been discriminated against and excluded from banking. No credit? Bad credit? That doesn't mean you're not trustworthy, they say, and we can mine your alternative data and give you a loan anyway.
But critics say that alternative data looks a lot like old-school surveillance. Letting a company have access to everything from your phone records to your search history means giving up all kinds of sensitive data in the name of credit.
"Coming out of the shadows also means becoming more visible and more trackable," Nopper told me when I reported on alternative credit for Slate. "That becomes an interesting question about what banking the unbanked immigrant will mean for issues of surveillance when more and more activities are documented and tracked."
Experts worry that the push to use alternative data might lead, once again, to a situation similar to the subprime mortgage crisis if marginalized communities are offered predatory loans that wind up tanking their credit scores and economic stability.
"If a borrower's application or pricing is based, in part, on the creditworthiness of her social circles, that data can lead to clear discrimination against minorities compared to white borrowers with the same credit scores," wrote Lauren Saunders, the associate director of the National Consumer Law Center, in a 2015 letter to the U.S. Department of the Treasury expressing concerns about these tactics.
Just this week on Twitter, Sen. Elizabeth Warren demanded to know what federal government financial institutions "are doing to ensure lending algorithms are fair and non-discriminatory."
In other words, in trying to reduce the number of "credit invisible" people out there, the banking industry might have created an even bigger problem.
So far there have been no high-profile cases brought to trial that allege discrimination based on alternative credit. The methods here are new, and they often target folks who don't necessarily have the time and money to pursue legal options even if they do feel discriminated against (remember, alternative credit is often aimed at those who don't even have a bank account).
In the United States, banks and businesses legally must be able to explain why an "adverse credit decision" was made, that is, why someone wasn't offered a loan or a line of credit. For companies that want to use machine learning, this can be a challenge because AI systems often make connections between pieces of data that the companies can't necessarily explain.
"Credit-scoring tools that integrate thousands of data points, most of which are collected without consumer knowledge, create serious problems of transparency," wrote the authors of a recent study on big data and credit. "Consumers have limited ability to identify and contest unfair credit decisions, and little chance to understand what steps they should take to improve their credit."
Artificial intelligence programs basically put data in a blender and the resulting milkshake is the number they produce. It can be very hard, if not impossible, for machine-learning experts to pick apart a program's decision after the fact. This is often referred to as the "explainability" problem, and researchers are currently working on methods for human scientists to peek under the hood and see how these programs make decisions.
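For contrast, here is a hedged Python sketch of what an explainable score can look like. It assumes a simple additive scorecard with made-up feature names, weights, and base score (none of them come from FICO or any real lender). Because each input contributes a separate, readable amount, the factors that pulled an applicant's score down can be listed directly. A deep-learning model offers no such per-feature breakdown out of the box, which is the gap explainability research is trying to close.

```python
# Hypothetical weights for a transparent, additive scorecard (illustrative only).
WEIGHTS = {
    "on_time_payment_rate": 200.0,  # fraction of bills paid on time, 0..1
    "credit_utilization": -150.0,   # fraction of available credit in use, 0..1
    "years_of_history": 5.0,        # length of credit history in years
}
BASE_SCORE = 500.0  # assumed starting point before any feature is counted


def score(applicant):
    """Add each feature's weighted value to the base score."""
    return BASE_SCORE + sum(WEIGHTS[k] * applicant[k] for k in WEIGHTS)


def reason_codes(applicant, reference, top_n=2):
    """List the features that lowered the score most versus a reference profile."""
    deltas = {k: WEIGHTS[k] * (applicant[k] - reference[k]) for k in WEIGHTS}
    # Keep only features that hurt the score, most damaging first.
    negative = sorted((d, k) for k, d in deltas.items() if d < 0)
    return [k for _, k in negative[:top_n]]


if __name__ == "__main__":
    # Both profiles are invented for this example.
    applicant = {
        "on_time_payment_rate": 0.6,
        "credit_utilization": 0.9,
        "years_of_history": 2,
    }
    reference = {
        "on_time_payment_rate": 1.0,
        "credit_utilization": 0.3,
        "years_of_history": 10,
    }
    print(score(applicant))
    print(reason_codes(applicant, reference))
```

For this invented applicant, the two listed reasons are high credit utilization followed by a low on-time payment rate, which is the kind of statement a lender could hand to a rejected customer.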
Nopper wonders if there's another way. Even as politicians question the algorithms around credit scores, they're not making arguments for the end of credit. "There's not necessarily this call to end marketplace lending, just a call to regulate it," Nopper said. "How did these institutions become so pervasive in our imagination that we can't think of true alternatives?"
Nopper points to the campaign for public banking in New York City. Rather than allowing private companies to use their algorithms and surveillance systems to determine credit, could there be a publicly run bank with the explicit purpose of serving the community, not shareholders?
If banks aren't set up to maximize their own profits, they might take a different tack when it comes to credit, one that is less open to systemic bias.
Update, July 1, 2019: The story has been updated to include a comment from ZestFinance.