AI Can Manipulate Video to Make Everybody Dance Now

August 24, 2018, 11:15am

Algorithms are getting good at churning out video that convincingly shows a real person doing something they’ve never done in reality.

In a paper posted to the arXiv preprint server this week, researchers at the University of California, Berkeley demonstrate an AI system that, given video of an expert dancer and video of an amateur, can transfer the moves from one to the other and create convincing video of the amateur pulling off some seriously impressive rug-cutting. But that’s not all.


“With our framework,” the researchers wrote, “we create a variety of videos, enabling untrained amateurs to spin and twirl like ballerinas, perform martial arts kicks or dance as vibrantly as pop stars.”

The paper is called “Everybody Dance Now,” which is charming because it conjures ideas like recreating the entire “Evolution of Dance” viral video with algorithms and minimal physical effort. But it also represents the latest step forward in creating highly realistic video that can put people in situations that they were never really in—so far, AI has been able to recreate voices, faces, and now great dancing.

These concerns aside, it really is some impressive tech. The researchers report that they first used an algorithm to detect poses from a source video of a dancer, basically creating a moving stick figure. They also took video of the target person performing a range of motion. Then they trained two deep learning models known as generative adversarial networks (GANs): one to produce the whole image, and one to produce a sharper and more realistic face for the amateur subject. The end result is a system that generates video and convincingly maps the bodily motion of an expert dancer onto a total amateur.
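To get a feel for that "moving stick figure" step, here is a minimal Python sketch of how one frame of detected joints might be turned into a stick-figure image. It is not the researchers' code: the joint names, the bone list, the coordinates, and the function names are all made up for illustration, and a real pose detector tracks many more points. The GANs that turn such a figure into a photorealistic frame are not shown.

```python
# Illustrative sketch only. The joint names, BONES list, and coordinates are
# assumptions for this example and are not taken from the paper.

BONES = [
    ("head", "neck"),
    ("neck", "hip"),
    ("neck", "left_hand"),
    ("neck", "right_hand"),
    ("hip", "left_foot"),
    ("hip", "right_foot"),
]


def draw_line(grid, start, end):
    """Mark every grid cell on the straight segment from start to end."""
    (x0, y0), (x1, y1) = start, end
    steps = max(abs(x1 - x0), abs(y1 - y0), 1)
    for i in range(steps + 1):
        t = i / steps
        x = round(x0 + t * (x1 - x0))
        y = round(y0 + t * (y1 - y0))
        if 0 <= y < len(grid) and 0 <= x < len(grid[0]):
            grid[y][x] = 1


def render_stick_figure(keypoints, width, height):
    """Draw one pose (a dict of joint name -> (x, y)) as a 0/1 grid."""
    grid = [[0] * width for _ in range(height)]
    for joint_a, joint_b in BONES:
        # Skip any bone whose joints are missing from this frame.
        if joint_a in keypoints and joint_b in keypoints:
            draw_line(grid, keypoints[joint_a], keypoints[joint_b])
    return grid


def to_text(grid):
    """Render the grid as text, with '#' for drawn cells."""
    return "\n".join("".join("#" if cell else "." for cell in row) for row in grid)


# One hypothetical frame of detected joints, as (x, y) pixel positions.
pose = {
    "head": (5, 0),
    "neck": (5, 2),
    "hip": (5, 6),
    "left_hand": (1, 4),
    "right_hand": (9, 4),
    "left_foot": (2, 10),
    "right_foot": (8, 10),
}

frame = render_stick_figure(pose, width=11, height=11)
print(to_text(frame))
```

Running this once per video frame would give a sequence of stick figures, which is roughly the kind of intermediate representation the article describes being handed to the image-generating network.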

The system isn’t perfect, the researchers admit. The resulting video is convincing, but occasional aberrations such as stuttering, disappearing body parts, and things looking “melty” break the illusion. In fact, the amateur target sometimes had to be filmed hitting specific poses from the source dance video to make the final result more convincing. These inconsistencies, the authors write, could have something to do with current limitations in pose detection and differences between body types.

Fun? Mortifying? These days, it’s hard to tell which way to feel first when it comes to impressive new tech for making people do things they never could—or possibly, would—otherwise.
