Morgan is a horror movie about an artificial child that quickly runs out of control and causes much death and destruction. Despite that premise, producers at Fox thought it might be a good idea to get a real artificial intelligence involved in preparing the film for release (the movie came out on Friday). So they asked a research team at IBM to use their multitasking AI, Watson, to pick the scenes for a movie trailer.
Watson did so with gusto, astounding even the developers of the learning software. This could open doors to other uses of video processing, and take the work out of sorting through large amounts of video data, including what's generated by police-worn body cameras.
As for the trailer, "this was very much an experimental investigation. We didn't know a priori how this was going to turn out," said John Smith, lead of the IBM research team in Yorktown Heights, New York, in a phone call. "It was the first time we went into a very specific genre, like horror. In the end, it was quite surprising and we were very pleased with what we learned from this."
Morgan. IBM Creates First Movie Trailer By AI. Video: 20th Century Fox/YouTube
The team used systems that they had already developed, known as the visual recognition services part of the IBM Watson Developer Cloud, which allows companies to tap into the abilities of this particular AI. Watson could already process video footage and interpret emotional content. All the team needed to do was teach Watson about the horror genre specifically.
After feeding the AI roughly 100 horror movies from various eras (including the 1976 classic The Omen), it didn't take much for the software to begin picking out patterns.
"The approach that we took here was very much machine learning, deep learning statistical analysis," said Smith. The AI was trained on a series of inputs (horror movies) to learn to recognize features. Watson could then label emotions that were present in a scene, without human intervention.
The AI can successfully tell what's in a scene because emotions can be quantified in two dimensions: arousal and valence. Each emotion falls on a Cartesian plane where valence is measured as negative or positive. The second axis, arousal, is the level of agitation: Is the emotion soothing or exciting?
"For example, 'joy' can be considered high arousal and high valence, 'sad' is low arousal and low valence, 'fear' is high arousal and low valence," Smith elaborated in a follow-up email. "When applied to a new movie, the computer sees patterns in this arousal vs. valence space along with the extracted visual object and place categories, which allows it to pick the statistically dominant scenes."
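To make the two-axis idea concrete, here is a minimal Python sketch that places an emotion in a quadrant of the valence-arousal plane. The function name, the -1 to 1 scale, and the coordinates are assumptions chosen for illustration; they are not IBM's values and this is not how Watson is implemented.

```python
def quadrant(valence: float, arousal: float) -> str:
    """Describe where a point falls on the valence-arousal plane.

    Both inputs are assumed to lie in [-1, 1], with 0 as the neutral midpoint.
    """
    a = "high arousal" if arousal >= 0 else "low arousal"
    v = "high valence" if valence >= 0 else "low valence"
    return f"{a}, {v}"


# Illustrative (valence, arousal) coordinates, made up for this sketch.
EMOTIONS = {
    "joy": (0.8, 0.7),
    "sad": (-0.7, -0.6),
    "fear": (-0.8, 0.8),
}

for name, (valence, arousal) in EMOTIONS.items():
    print(f"{name}: {quadrant(valence, arousal)}")
```

With these made-up coordinates, the labels match the three examples Smith gives: joy is high on both axes, sad is low on both, and fear is high arousal with low valence.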
Researchers initially tried to get Watson to recognize tropes of the genre: classic scenes of a mysterious figure hiding from the character who's on the phone, for example, or the car that won't start. But what the AI picked up on was emotional content, like fear and tenderness.
"When we looked at the patterns of those emotional-type signals in trailers," said Smith, "we could actually see that they were dominating."
They ran the entirety of Morgan through this processing. Watson chose ten scenes to send to the human editor, who eventually cut the trailer together.
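The selection step can be pictured as a simple top-ten pick. The sketch below assumes each scene already carries a single numeric score; the `Scene` type, its fields, and the scoring are hypothetical, since the article does not describe how Watson ranked scenes internally.

```python
import heapq
from typing import List, NamedTuple


class Scene(NamedTuple):
    start_s: float  # scene start, in seconds from the beginning of the film
    end_s: float    # scene end, in seconds
    score: float    # hypothetical "trailer-worthiness" score


def top_scenes(scenes: List[Scene], k: int = 10) -> List[Scene]:
    """Return the k highest-scoring scenes, ordered by where they occur in the film."""
    best = heapq.nlargest(k, scenes, key=lambda s: s.score)
    return sorted(best, key=lambda s: s.start_s)
```

Returning the picks in film order rather than score order is a design choice for this sketch: it hands a human editor the candidates in the sequence they appear on screen.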
"In the end, [the editor] used nine out of the ten [clips] and did some very nice work." The editor fine-tuned the boundaries of the clips, inserted transitions and added the music score, according to Smith. "So, that was a completely human process," he said.
With advances in video processing, the tech is also applicable to large collections of video footage, like police-worn body cameras. A current problem with the use of these cameras is the gargantuan task of sorting through hours of mundane video. When I brought up that connection, Smith told me that IBM is in the midst of working on a solution to that problem, too.
"A lot of the same components and tools that we used here for Morgan, we're also applying to body-worn cameras for law enforcement. It's a big area, 12 million police officers worldwide, and many of them will be wearing body-worn cameras. That's a huge amount of video, so we're already going down that path," he said.