With only 30 minutes of audio, companies can now create a digital clone of your voice and make it say words you never said.
Using machine learning, voice AI companies like VocaliD can create synthetic voices from a person's recorded speech—capturing unique qualities like speaking rhythm, pronunciation of consonants and vowels, and intonation.
For tech companies, the ability to generate any sentence with a realistic-sounding human voice is an exciting, cost-saving frontier. But for the voice actors whose recordings form the foundation of text-to-speech (TTS) voices, this technology threatens to disrupt their livelihoods, raising questions about fair compensation and human agency in the age of AI.
At the center of this reckoning is voice actress Bev Standing, who is suing TikTok after alleging the company used her voice for its text-to-speech feature without compensation or consent. Hers is not the first such case; voice actress Susan Bennett discovered that audio she recorded for another company had been repurposed as the voice of Siri after Apple launched the feature in 2011. She was paid for the initial recording session but not for being Siri. Rallying behind Standing, voice actors have donated nearly $7,000 to a GoFundMe for her legal expenses and posted TikTok videos under the #StandingWithBev hashtag warning users about the feature.
Standing's supporters say the TikTok lawsuit is not just about Standing's voice—it's about the future of an entire industry attempting to adapt to new advancements in the field of machine learning.
“I do fear that if she loses, or if she otherwise has to drop the case for some reason, that that could set a precedent that companies are allowed to just use our voices as they please,” voice actor Calvin Joyal told Motherboard.
Losing control over the use of one’s voice can be detrimental to a voice actor’s career. If their synthetic double voices commercials for a client’s competitors, they could be in violation of contracts requiring exclusivity over their voice. If it voices ideas against their brand or beliefs, they could damage their reputation with clients. If it is resold for a different use, they lose potential income. Misuse of recordings is a risk across all voice acting jobs, but text-to-speech and AI present new and more pervasive methods for performers to lose control over what they say—and who they say it for.
Standing’s case materializes some performers’ worst fears about the control this technology gives companies over their voices. Her lawsuit claims TikTok neither paid nor notified her before using her likeness for its text-to-speech feature, and that some videos using the feature voiced “foul and offensive language,” causing “irreparable harm” to her reputation. Brands advertising on TikTok also had the text-to-speech voice at their disposal, meaning her voice could be used for explicitly commercial purposes.
Voice actors can fight for specific protections against reuse or automation of their voice when they sign contracts with new clients, but Maria Pendolino said recent shifts in the industry present new challenges to negotiation. Pendolino, a voice actress since 2010 who has presented on contract negotiation at industry conferences, said there has been an increased demand for voice acting work. She said clients who are new to the industry often demand “a ton of boilerplate usage,” including the ability to use files “in perpetuity” and for new uses not initially discussed.
Pendolino worries about performers unknowingly signing their voices away—and how easy it is for them to do so. While agents and managers negotiated contracts for talent roughly 15 years ago, today voice actors often market themselves to clients directly. With a credit card and microphone, anyone can audition on “pay-to-play” sites, the digital marketplaces where many voice actors find work. Without industry knowledge, some text-to-speech and AI jobs may seem like better deals than they are, particularly to newer voice actors eager to find work.
“If you don't have that knowledge you could walk into a very murky landscape, be wildly underpaid for your work, be severely taken advantage of, and find out two years later that your voice is everywhere,” Pendolino told Motherboard.
Jim Kennelly, the owner of a recording studio specializing in voiceovers, said he has seen such lopsided agreements firsthand. Kennelly recalled a woman who approached his studio, Lotas Productions, to record her voice for a gaming app company. She recorded for roughly four hours, received $25,000, and in return the company could use her voice for any character across its games—forever.
“For that talent, to receive a fee of $25,000, she was happy. Now, personally, we would tell someone that's a bad deal,” Kennelly told Motherboard. “Even though $25,000 is a lot of money, the fact that they can just cook you and recreate you over and over again isn't really in your interest.”
Rate guides developed by organizations like the Global Voice Acting Academy (GVAA) help performers determine what they should charge based on the usage of their recordings. But GVAA CEO David Rosenthal said an industry standard for AI is not set, and rates for this work fluctuate significantly as the technology improves.
“The problem is that with AI and TTS, we're in the wild, wild west right now,” Rosenthal told Motherboard.
Laws protecting individuals from unauthorized clones of their voices are also in their infancy. Standing’s lawsuit invokes her right of publicity, which grants individuals the right to control commercial uses of their likeness, including their voice. In November 2020, New York became the first state to apply this right to digital replicas after years of advocacy from SAG-AFTRA, a performers’ union.
“We look to make sure that state rights of publicity are as strong as they can be, that any limitations on people being able to protect their image and voice are very narrowly drawn on first amendment lines,” Jeffrey Bennett, a general counsel for SAG-AFTRA, told Motherboard. “We look at this as a potentially great right of publicity case for this voice professional whose voice is being used in a commercial manner without her consent.”
Though rates and laws around voice cloning are still forming, Kennelly’s studio is preparing for an industry transition by developing an AI division. He predicts this technology will create more work for voice actors: performers will license their synthetic voices to monetize themselves outside the studio, and companies will hire performers to develop specialized AI brand voices unlike the “generic” standard set by Apple, Amazon, and Google.
Voice actor Mike DelGaudio agreed the future of licensing personas is “absolutely coming” and sees its potential to scale his business. But DelGaudio said he wants laws around the misuse of one’s likeness to adapt to voice deepfakes, and sees Standing’s case as a “watershed moment” for the industry.
“Our voice is somewhat immutable,” DelGaudio told Motherboard. “If my voice gets licensed out and used and the reputation of that voice becomes spoiled so that you can’t earn from it anymore, I can’t go back and just invent a new voice for myself.”