AI Can Now Generate Videos From Text, Courtesy of Facebook

Make-A-Video is the first time we are seeing a text-to-video tool that will soon be available to the public. 
Screenshot of a video of a teddy bear painting a self-portrait created by Make-A-Video from Meta.
Facebook / Meta

On Thursday, Facebook’s parent company Meta announced Make-A-Video, a tool that generates short video clips from text descriptions—an unsettling, albeit inevitable, next step for the world of AI image generation. 

The tool follows the company’s Make-A-Scene, launched in July, which generates still images from text descriptions. While comparable text-to-image tools like DALL-E and Midjourney have taken over the internet, Make-A-Video is the first time we are seeing a text-to-video tool that will soon be available to the public. 


“Generative AI research is pushing creative expression forward by giving people tools to quickly and easily create new content,” Meta’s press release said. “With just a few words or lines of text, Make-A-Video can bring imagination to life and create one-of-a-kind videos full of vivid colors, characters, and landscapes. The system can also create videos from images or take existing videos and create new ones that are similar.”

“It's much harder to generate video than photos because beyond correctly generating each pixel, the system also has to predict how they'll change over time,” Meta CEO Mark Zuckerberg wrote in a Facebook post. “Make-A-Video solves this by adding a layer of unsupervised learning that enables the system to understand motion in the physical world and apply it to traditional text-to-image generation.” 

The example videos on the Make-A-Video site show “a dog wearing a Superhero outfit with red cape flying through the sky” and “a teddy bear painting a portrait.” The videos are clearly AI-generated, with the blurry, painterly quality characteristic of AI-generated images. They nonetheless show the fast-moving progress of AI art systems, which only a few years ago were the stuff of memes and science fiction. 

Screenshot of a video of a superhero dog created by Make-A-Video from Meta.

Meta seems to be aware of the dangers behind AI art-generating systems, and claims it is “openly sharing this generative AI research and results with the community for their feedback, and will continue to use our responsible AI framework to refine and evolve our approach to this emerging technology.” 

But according to the Make-A-Video research paper, the image models were trained using a subset of the LAION dataset, which is scraped from unfiltered web data and is known to produce biased results. Motherboard recently reported that the dataset included images of ISIS executions, nonconsensual nudes, and photoshopped nudes of celebrities. Meta seems to address this issue by paring the original data set of over 5.8 billion images down to 2.3 billion, with the paper’s authors claiming, “We filter out sample pairs with NSFW images, toxic words in the text, images with a watermark probability larger than 0.5.” 
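
To make the quoted filtering rule concrete, here is a minimal sketch in Python of what dropping sample pairs on those three criteria could look like. It is an illustration only, not Meta's code: the Sample fields, the placeholder word list, and the function names are all assumptions, and the only value taken from the paper's quoted sentence is the 0.5 watermark threshold.

```python
from dataclasses import dataclass

# Threshold taken from the sentence quoted from the paper.
WATERMARK_THRESHOLD = 0.5

# Placeholder list (assumption): the actual word list is not given in the article.
TOXIC_WORDS = {"badword"}


@dataclass
class Sample:
    """One image-text pair with hypothetical, precomputed scores."""
    caption: str
    is_nsfw: bool          # assumed output of some NSFW classifier
    watermark_prob: float  # assumed output of some watermark detector


def keep(sample: Sample) -> bool:
    """Return True only if the pair passes all three quoted filters."""
    if sample.is_nsfw:
        return False
    # Simple whitespace tokenization; a real pipeline would likely do more.
    if set(sample.caption.lower().split()) & TOXIC_WORDS:
        return False
    # "Larger than 0.5" means a probability of exactly 0.5 is kept.
    if sample.watermark_prob > WATERMARK_THRESHOLD:
        return False
    return True


def filter_samples(samples):
    """Keep only the pairs that pass every filter, preserving order."""
    return [s for s in samples if keep(s)]
```

A filter like this is only as good as the classifiers and word list behind it, which is part of why researchers remain skeptical that filtering alone resolves the dataset's problems.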

Meanwhile, AI ethics researchers have pushed back against the use of these large language models, warning that their sheer size creates fundamental problems of harmful bias that cannot be easily solved. Even Facebook’s own researchers have admitted that their language models have a “high propensity” for producing racist and harmful results.

The introduction of text-to-video as a tool for artists and creators also complicates the ongoing debate over whether AI-generated art should be considered legitimate. In August, a man named Jason Allen won an art competition using an AI-generated image, prompting intense backlash online, with artists accusing Allen of expediting the death of creative jobs. 

AI-generated images are also being removed from Shutterstock and Getty Images. Getty Images CEO Craig Peters said this was because of copyright concerns. Copyright and privacy policy have not yet caught up with the quick development of AI image systems, leaving many questions unanswered about who owns the images used to train AI algorithms, and whether transforming those images into new ones violates copyright. 

Meta’s announcement follows OpenAI’s release of DALL-E 2 to the public on Wednesday. OpenAI, which developed DALL-E 2, removed the system’s waitlist, allowing anyone to generate images from text prompts. But even as the public gets access to more and more AI art-generating tools, some of the most fundamental ethical questions about their use remain unanswered.