The bank thought it was talking to me; the AI-generated voice certainly sounded the same.
On Wednesday, I phoned my bank’s automated service line. To start, the bank asked me to say in my own words why I was calling. Rather than speak out loud, I clicked a file on my nearby laptop to play a sound clip: “check my balance,” my voice said. But this wasn't actually my voice. It was a synthetic clone I had made using readily available artificial intelligence technology.
“Okay,” the bank replied. It then asked me to enter or say my date of birth as the first piece of authentication. After I typed that in, the bank said, “please say, ‘my voice is my password.’”
Again, I played a sound file from my computer. “My voice is my password,” the voice said. The bank's security system spent a few seconds authenticating the voice.
“Thank you,” the bank said. I was in.
I couldn’t believe it—it had worked. I had used an AI-powered replica of a voice to break into a bank account. After that, I had access to the account information, including balances and a list of recent transactions and transfers.
Banks across the U.S. and Europe use this sort of voice verification to let customers log into their account over the phone. Some banks tout voice identification as equivalent to a fingerprint, a secure and convenient way for users to interact with their bank. But this experiment shatters the idea that voice-based biometric security provides foolproof protection in a world where anyone can now generate synthetic voices for cheap or sometimes at no cost. I used a free voice creation service from ElevenLabs, an AI-voice company.
Now, abuse of AI-voices can extend to fraud and hacking. Some experts I spoke to after doing this experiment are now calling for banks to ditch voice authentication altogether, although real-world abuse at this time could be rare.
Rachel Tobac, CEO of social engineering-focused firm SocialProof Security, told Motherboard, “I recommend all organizations leveraging voice ‘authentication’ switch to a secure method of identity verification, like multi-factor authentication, ASAP.” This sort of voice replication can be “completed without ever needing to interact with the person in real life,” she added.
Online trolls have already used ElevenLabs to make replica voices of people without their consent, using clips of those people’s voices found online. Potentially anyone with even a few minutes of their voice publicly available (YouTubers, social media influencers, politicians, journalists) could be susceptible to this sort of voice cloning.
I performed the test on an account with Lloyds Bank in the UK. On its website, Lloyds Bank says its “Voice ID” program is safe. “Your voice is like your fingerprint and unique to you,” the site reads. “Voice ID analyses over 100 different characteristics of your voice which like your fingerprint, are unique to you. Such as, how you use your mouth and vocal chords, your accent and how fast you talk. It even recognises you if you have a cold or a sore throat,” it adds.
Plenty of banks in the U.S. offer similar voice verification services. TD Bank has one called “VoicePrint,” and says on its website “Your voiceprint, like your fingerprint, is unique to you—no one else has a voice just like you.” Chase has “Voice ID” which, like Lloyds Bank, also claims a customer’s voiceprint “is created from more than 100 different physical and behavioral characteristics.” Wells Fargo’s “Voice Verification,” meanwhile, “effectively protects your identity,” according to the bank’s website.
Although I only conducted the test on Lloyds Bank, these other systems are similar in nature and function, so they may be at risk from AI-powered voices too. Many banks allow users to carry out a host of banking tasks over the phone, such as checking transaction history and account balances, and in some cases transferring funds.
For this particular attack, a fraudster would also need the target’s date of birth. But thanks to a plethora of data breaches, brokers, or people sharing personal details online, a date of birth is often readily available.
A Lloyds Bank spokesperson said in a statement that “Voice ID is an optional security measure, however we are confident that it provides higher levels of security than traditional knowledge-based authentication methods, and that our layered approach to security and fraud prevention continues to provide the right level of protection for customers' accounts, while still making them easy to access when needed.”
Lloyds Bank said it is aware of the threat of synthetic voices and is deploying countermeasures, but has not seen a case where such a voice has been used to commit fraud against its customers. Synthetic voices are not as attractive to fraudsters as other, much more common methods, and Voice ID has led to a significant dip in phone banking fraud, Lloyds Bank said.
Given how rare synthetic voice fraud is at the moment, consumers are likely better off using voice ID if it protects them from other sorts of fraud, such as phishing. That calculus might change if the consumer is a public figure with lots of high-quality audio of their voice readily available on the internet.
TD Bank, Chase, and Wells Fargo did not respond to a request for comment on whether they are aware of AI-powered voices being used to target customer accounts, and what mitigations, if any, they are taking to stop the threat. In September, lawyers sued a group of U.S. financial institutions, alleging that the biometric voice prints used to identify callers violate the California Invasion of Privacy Act.
The Consumer Financial Protection Bureau, one of the U.S. agencies that regulates the financial industry, told me in a statement after I sent the video demonstration, “The CFPB is concerned with data security, and companies are on notice that they’ll be held accountable for shoddy practices. We expect that any firm follow the law, regardless of technology used.”
Do you know anything else about bank voice ID, or how AI voices are being abused? We'd love to hear from you. Using a non-work phone or computer, you can contact Joseph Cox securely on Signal on +44 20 8133 5190, Wickr on josephcox, or email firstname.lastname@example.org.
Over the last few weeks I have tested a few AI-voice generation services. Most of them had problems or limitations with recreating my British accent, which would be necessary to access the Lloyds Bank account. Eventually I used ElevenLabs, which handled the accent well.
To create the voice, I recorded about five minutes of speech and uploaded it to ElevenLabs (for the audio clips, I read sections of Europe’s data protection law). A short while later, the synthetic voice was ready to use, and would say whatever text was entered into ElevenLabs’ site.
My attempts to get into the bank account failed multiple times, with Lloyds Bank’s system saying it could not authenticate the voice. After I made some tweaks on ElevenLabs, such as having it read a longer body of text to make cadences sound more natural, the generated audio successfully bypassed the bank’s security.
On its website ElevenLabs says its use cases include providing voices for newsletters, books, and videos. But with minimal guardrails in place at launch, people quickly abused ElevenLabs’ technology. Members of 4chan used ElevenLabs to make synthetic versions of celebrities spout racist and transphobic things, such as a fake Emma Watson reading Mein Kampf. Later, trolls used AI-voice generators to make replicas of specific voice actors, and then had them read out the actors’ home addresses in posts on Twitter (the attackers claimed ElevenLabs’ technology was used as part of the dox, but ElevenLabs claimed only one other clip, which did not include the target’s addresses, was made with its software).
After the celebrity clips, ElevenLabs tweeted to ask what safeguards it should put in place, such as asking for full ID verification of users or requiring payment information. Motherboard, however, was able to generate the voice without providing ID or any payment information, potentially because the account was made before ElevenLabs introduced new security measures. Creating the voice that bypassed the bank’s security cost nothing.
ElevenLabs did not respond to multiple requests for comment. In a previous statement, Mati Staniszewski, an ex-Palantir deployment strategist and now co-founder of ElevenLabs, said, “Our new safeguards are already rapidly reducing instances of misuse and we're grateful to our user community for continuing to flag any examples where extra action needs to be taken and we will support authorities in identifying those users if the law was broken.”
Update: This piece has been updated with a statement from the Consumer Financial Protection Bureau. It has also been updated to correct that ElevenLabs’ technology was not used to read the dox of voice actors, but was used by the same attackers.