This App Claims It Can Detect 'Trustworthiness.' It Can't

“Determine how trustworthy a person is in just one minute.” That’s the pitch from DeepScore, a Tokyo-based company that spent last week marketing its facial and voice recognition app to potential customers and investors at CES 2021.

Here’s how it works: A person—seeking a business loan or coverage for health insurance, perhaps—looks into their phone camera and answers a short series of questions. Where do you live? How do you intend to use the money? Do you have a history of cancer? DeepScore analyzes the muscular twitches in their face and the changes in their voice and delivers a verdict to the lender or insurer. This person is trustworthy, this person is probably not.

Since its founding in 2019, DeepScore has attracted thousands of users in the money lending and health insurance industries, primarily in Japan, Indonesia, Vietnam, and the Philippines, according to its CEO, Shirabe Ogino. It aims to revolutionize credit scoring, particularly in emerging markets and countries where industry giants like FICO and Zhima Credit struggle to operate because most residents don’t have detailed documentation of their debt, spending, or identity. Ogino did not disclose any specific companies that use the service.

“If you don’t have any kind of information available to your lenders, you just don’t get any money at all,” Ogino told Motherboard. “People have freedom not to answer questions or not to use our engine, but they just can’t get money.”

Privacy and human rights advocates are alarmed by DeepScore’s premise—that the minute signals captured by facial and vocal recognition algorithms reliably correspond to something as subjective and varied as a person’s honesty.

“The serious concern I have about this kind of technology is that there is simply no reliable science to indicate that these peoples’ facial expressions or the inflections of their voice are proxies for their internal mental and emotional states,” Amos Toh, a senior researcher studying artificial intelligence for Human Rights Watch, told Motherboard.

Some research has found connections between facial expressions, such as different kinds of smiles, and dishonesty. In a slide presentation shared with Motherboard, DeepScore states that there are more than 200 research papers on “micro-movement and stress” that underlie its technology. It provides a list of 18 of those papers. Many of those papers show correlations, which are not necessarily reliable indicators of truthfulness. Other research, meanwhile, indicates that facial expressions are poor indicators of mental state and vary significantly by culture. On top of that, facial recognition technology has consistently been shown to be less accurate for women and people with darker skin tones.

Ogino said that the company calibrates its algorithms for each entity to account for the physical variations between their customer bases. He also stressed that lenders and insurers only use DeepScore as one part of their decision-making process. “If you apply for $5 million while your income is just $1, nobody will lend you money. So they use other information as well,” he said.

In a promotional slide deck and on its website, DeepScore says it can detect deception with 70 percent accuracy and a 30 percent false negative rate. The test was conducted on a cohort composed equally of people telling lies and truths, according to DeepScore’s website. The false negative rate and opportunity for biased results are particularly concerning to AI researchers.
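To make those figures concrete, here is a rough back-of-the-envelope sketch of what they would imply, assuming "accuracy" means the share of all verdicts that are correct and a "false negative" is a lie the app fails to flag. The cohort size of 1,000 and the function name are hypothetical, chosen only for illustration; DeepScore has not published its test data in this form.

```python
def confusion_counts(cohort_size, accuracy, false_negative_rate):
    """Derive confusion-matrix counts for a cohort split evenly
    between people lying and people telling the truth.

    Assumptions (not from DeepScore): 'positive' means 'flagged as
    deceptive', and accuracy = correct verdicts / all verdicts.
    """
    liars = cohort_size // 2
    truthful = cohort_size - liars

    # False negatives: liars the app fails to flag.
    missed_liars = round(liars * false_negative_rate)
    caught_liars = liars - missed_liars

    # Accuracy fixes the total number of correct verdicts, so the
    # remainder after caught liars must be correctly cleared truth-tellers.
    correct_total = round(cohort_size * accuracy)
    cleared_truthful = correct_total - caught_liars

    # False positives: truthful people wrongly flagged as deceptive.
    flagged_truthful = truthful - cleared_truthful

    return {
        "caught_liars": caught_liars,
        "missed_liars": missed_liars,
        "cleared_truthful": cleared_truthful,
        "flagged_truthful": flagged_truthful,
        "false_positive_rate": flagged_truthful / truthful,
    }


# Hypothetical cohort of 1,000 people using the figures DeepScore cites.
counts = confusion_counts(cohort_size=1000, accuracy=0.70, false_negative_rate=0.30)
print(counts)
```

Under these assumptions, 150 of the 500 truthful people in the hypothetical cohort would be wrongly flagged as deceptive, a false positive rate that also works out to 30 percent.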

Dr. Rumman Chowdhury, founder of Parity AI, an algorithmic bias auditing platform, told Motherboard the app is at minimum likely to discriminate against people who have tics or anxiety or who are neuroatypical. It’s unlikely the studies DeepScore cited as proof of concept involved participants from the specific unbanked communities in Asia the company is targeting, she said, and many of them appear to address the correlation between facial movements and stress, not lying, which raises further questions about their relevance.

“You can say that, in aggregate, there are some general trends in human beings when they are lying,” Chowdhury said. “It doesn’t disaggregate to an individual human being. I might touch my nose because it’s a nervous tic that I’ve developed. It doesn’t mean that I’m lying.”

DeepScore is purposefully targeting markets where information about customers is hard to come by. Ogino said “any algorithm is not 100 percent accurate,” but the goal is to bring loan opportunities and insurance to people who would otherwise be excluded because businesses have no way to determine whether they’re low- or high-risk. Its possible uses involve life-or-death decisions.

One slide in the company’s pitch deck describes a use case for insurers: to detect “if [an applicant is] diagnosed with cancer but hiding it since it’s at an early stage.” Should the app detect that the applicant replied dishonestly to one of 10 questions, it could trigger “fee increases or additional examinations.”

A slide presentation from DeepScore, showing how the app scores prospective borrowers based on voice patterns and facial movements.

The privacy implications of DeepScore are vast. Ioannis Kouvakas, a legal officer for Privacy International, reviewed the company’s slide deck and told Motherboard he didn’t believe it would be able to legally operate in the European Union due to the bloc’s General Data Protection Regulation (GDPR). Some of the countries in which DeepScore says it has active customers, such as Indonesia and Vietnam, do not yet have comprehensive data protection laws in place.

DeepScore doesn’t have a privacy policy listed on its website. Ogino said prospective borrowers or policy seekers can choose not to use the service or find another institution to work with.

“It’s very easy to claim that you rely on consent, but there is a very unfair balance of power,” Kouvakas said. “It’s really hard to say no to the person deciding whether you’re getting [your money] or not next month.”