After 20 Minutes of Listening, New Adobe Tool Can Make You Say Anything
Adobe promises never to abuse it, even as it uses it to abuse its own host.
Hell no. Image: Adobe Creative Cloud/YouTube
Ever wanted to make Natalie Portman yell obscenities at your neighbors? What if Gary Busey could leave your mother a sexy voicemail on her birthday? Wanted to prank your little brother by forcing him to call his crush and profess his love? Adobe has you covered.
When Adobe released Photoshop in 1990, it dreamed of a world where movie studios and photo editors could do in minutes what once took hours. It never dreamed the world would take the digital editor and use it to put celebrity heads on porn star bodies, distort women's bodies on magazine covers, and create vile memes.
Now, the same company that gave the world Photoshop wants to do for the human voice what it did for the human image: give people the tools to warp it in any way they see fit. At the Adobe Max Creativity Conference, the company premiered VoCo, an audio editing suite that will allow users to make people say whatever they want just by typing.
According to Adobe, after about 20 minutes of listening to a voice, users can make the voice say whatever they want just by typing it out. Comedian and director Jordan Peele hosted the event and Adobe tech Zeyu Jin demoed the process by editing an interview with Peele's comedic partner Keegan-Michael Key. Jin took existing audio of Key, then used the software to make him talk about making out with Peele instead of his wife.
In the audio clip, Key expresses his excitement about an award nomination. "I jumped on the bed," Key said. "And I, uh, kissed my dog and my wife … in that order." The screen above Jin's head displayed an audio waveform and a small box with the transcribed sentence.
Jin erased portions of the sentence and typed out new phrases, and the waveform changed to match. In seconds, the Adobe tech had made Key say he'd kissed his wife before his dog, then kissed Jordan, then kissed Jordan three times. Jin did all this just by typing. It sounded seamless.
"Don't worry," Jin said. "We actually have researched how to prevent forgery. Think about watermarking detection. As we're getting the results much better, making it so people can't distinguish between the fake and the real one, we're working harder trying to make it detectable." He then gave a thumbs up and grinned.
Jin said Adobe developed the software to help podcasters and audiobook editors. Typing out the new audio instead of re-recording would be a blessing in both professions. But Adobe knows this can be used to make people say things they didn't say. Hell, the first thing it did to demo the technology was mess with the host of its own conference.