Watch an Algorithm Turn Winter Into Summer in Any Video
Nvidia researchers developed a way to turn snowy roads into summer, and day into night.
Image: Ming-Yu Liu
Machine-learning algorithms are getting good at creating fake images that look real. In a study presented this week at the Conference on Neural Information Processing Systems, researchers at Nvidia, a company best known for its graphics processing units and, lately, its self-driving car technology, developed an algorithm for image translation that can alter the weather or time of day in a video.
Image translation is a machine-learning technique that takes one image as input and outputs the same scene with different attributes. Feed an algorithm a video of a snowy road, for example, and it returns a lush summer version of the same drive, as the Nvidia researchers have done:
It’s similar to the Pix2Pix project, which gave us some pretty messed-up algorithmic faces back in June. The Nvidia researchers use variational autoencoders (VAEs) and generative adversarial networks (GANs) to build the framework their algorithm learns with. As the Two Minute Papers YouTube channel explained in its summary of the study, a generator network creates synthetic images in an attempt to fool a second network, which is simultaneously learning to tell fake images from real ones. It’s a push-and-pull between the two: one keeps getting better at spoofing images, and the other keeps getting better at spotting the spoofs.
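That push-and-pull can be seen in a toy example. The sketch below is not Nvidia's model (which trains deep convolutional networks on video frames); it shrinks the same adversarial loop down to one dimension, with a single "real" data point, a one-parameter generator, and a logistic-regression discriminator. All names and values here are illustrative.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

real_x = 3.0     # the "real data" the generator tries to imitate
theta = 0.0      # generator's output (its only parameter)
a, b = 0.5, 0.0  # discriminator weights: D(x) = sigmoid(a * x + b)
lr = 0.05

for _ in range(2000):
    # Discriminator step: nudge D(real) toward 1 and D(fake) toward 0,
    # using the analytic gradient of the logistic loss.
    p_real = sigmoid(a * real_x + b)
    p_fake = sigmoid(a * theta + b)
    a += lr * ((1.0 - p_real) * real_x - p_fake * theta)
    b += lr * ((1.0 - p_real) - p_fake)

    # Generator step (non-saturating loss): move theta in the direction
    # that makes the discriminator score its output as more "real".
    p_fake = sigmoid(a * theta + b)
    theta += lr * (1.0 - p_fake) * a

# After training, theta has drifted toward the real data point at 3.0:
# the generator learned to produce something the discriminator can no
# longer reliably separate from the real thing.
```

The structure is the same at full scale: alternate a discriminator update with a generator update, with each network's loss defined by the other's current behavior. Nvidia's contribution layers VAEs on top so the translation can be learned without exactly paired input/target images.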
Lead researcher Ming-Yu Liu told me in an email that the purpose of this work was to give machines the ability to create or “imagine” scenes on their own: "This is a difficult challenge, because most AI today requires you to have images (training data) that exactly correspond for both the input and target image."
This kind of "imagination" could speed up self-driving-car training across a variety of environments: running a simulation of a snowy road, for example, instead of repeatedly driving a real car through snow.
The Nvidia researchers used six of these networks in their experiments. The results are fairly believable if you don't look too closely. If you told me the output was a real video taken months later with a potato-quality dashboard camera, I'd buy it.