

Video Games Are So Realistic That They Can Teach AI What the World Looks Like

And they're only getting better.
Deus Ex: Mankind Divided. Image: Eidos Montreal

Thanks to the modern gaming industry, we can now spend our evenings wandering around photorealistic game worlds, like the post-apocalyptic Boston of Fallout 4 or Grand Theft Auto V's Los Santos, instead of doing things like "seeing people" and "engaging in human interaction of any kind."

Games these days are so realistic, in fact, that artificial intelligence researchers are using them to teach computers how to recognize objects in real life. Not only that, but commercial video games could kick artificial intelligence research into high gear by dramatically reducing the time and money required to train AI.


"If you go back to the original Doom, the walls all look exactly the same and it's very easy to predict what a wall looks like, given that data," said Mark Schmidt, a computer science professor at the University of British Columbia (UBC). "But if you go into the real world, where every wall looks different, it might not work anymore."


Schmidt works with machine learning, a technique that allows computers to "train" on a large set of labelled data (photographs of streets, for example) so that when let loose in the real world, they can recognize, or "predict," what they're looking at. Schmidt and Alireza Shafaei, a PhD student at UBC, recently studied Grand Theft Auto V and found that self-learning software trained on images from the game performed just as well as, and in some cases even better than, software trained on real photos from publicly available datasets. "Video game graphics have actually gotten good enough that you can train on raw data and have it be almost as good as real-world data," Schmidt continued. Of course, video games aren't advanced enough to be indistinguishable from reality, and so real images are still preferred. But you can cull so many labelled images from games that their sheer number makes up for the lack of detail in individual images.
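For readers who want to see what "training on labelled data" means in practice, here is a deliberately tiny sketch. It is not the method Schmidt and Shafaei used. It is a nearest-centroid classifier, one of the simplest forms of supervised learning, and the feature vectors and the labels "road" and "sky" are invented toy values standing in for real image data.

```python
from math import dist
from typing import Dict, List, Tuple

# One training example: a feature vector plus its human-readable label.
Example = Tuple[List[float], str]


def train(examples: List[Example]) -> Dict[str, List[float]]:
    """Average the feature vectors of each label into one centroid per label."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids: Dict[str, List[float]], features: List[float]) -> str:
    """Return the label whose centroid is closest to the new feature vector."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Toy, made-up data: imagine these came from labelled game screenshots.
training = [
    ([0.9, 0.1], "road"),
    ([0.8, 0.2], "road"),
    ([0.1, 0.9], "sky"),
    ([0.2, 0.8], "sky"),
]
model = train(training)

# An unseen example, standing in for a real-world photo.
print(predict(model, [0.7, 0.3]))
```

The point of the sketch is only the workflow: labelled examples go in, a model comes out, and the model is then asked about data it has never seen. Real computer vision systems use far larger models and millions of labelled pixels.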

Because popular machine learning datasets like Cityscapes and CamVid only contain images captured in European cities, games that realistically render North American locales can also help AI broaden its horizons.


Grand Theft Auto V. Image: Rockstar Games

"For example, European streets are narrower than streets in North America," said Stephan Richter, a PhD student at Technische Universität Darmstadt in Germany, who also studies computer vision. "If you train AI on German streets and try to use it on North American streets, it might not perform as well as if you'd had the proper training data."

Video games allow researchers to create labelled images much more quickly than when they work with real photos. When presented with a photo of a real street, researchers have to manually label every object in the image so the computer knows what it's looking at. That process can take many hours, and time is money.

But video game code always "knows" what's on-screen, and so essentially the images are pre-labelled for researchers. Because studios keep the source code of the games themselves protected, researchers instead rely on software that intercepts the commands a game sends to the computer's graphics hardware.

Hitman. Image: Io-Interactive

"We realized that because the game already knows what a car is and where it appears again in another frame, then we have the label already made and we can streamline the annotation process," said Richter. "We can massively annotate other frames when we see a 3D object again."
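The idea Richter describes, labelling an object once and reusing that label wherever the same object reappears, can be sketched in a few lines. This is an illustration under loud assumptions, not the team's actual pipeline: it assumes each frame has already been reduced to a grid of numeric object IDs (a hypothetical format), and the function names `propagate_labels` and `coverage` are made up for this example.

```python
from typing import Dict, List, Optional

# Hypothetical frame format: a grid where each cell holds the ID of the
# game object drawn at that pixel.
Frame = List[List[int]]
LabelledFrame = List[List[Optional[str]]]


def propagate_labels(frames: List[Frame], known: Dict[int, str]) -> List[LabelledFrame]:
    """Copy each known object label to every pixel showing that object, in every frame."""
    return [[[known.get(obj_id) for obj_id in row] for row in frame] for frame in frames]


def coverage(frame: LabelledFrame) -> float:
    """Fraction of pixels in a frame that ended up with a label."""
    total = sum(len(row) for row in frame)
    done = sum(1 for row in frame for label in row if label is not None)
    return done / total if total else 0.0


# Two tiny 2x2 "frames". Objects 1 and 3 appear in both.
frames = [
    [[1, 1], [2, 3]],
    [[3, 3], [1, 4]],
]
# A person labels objects 1 and 3 a single time.
known = {1: "car", 3: "road"}

labelled = propagate_labels(frames, known)
for frame in labelled:
    print(frame, coverage(frame))
```

Two human decisions label three quarters of the pixels in both frames here. Objects 2 and 4 stay unlabelled until someone names them, at which point they too would be filled in everywhere they appear.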

Richter and a team of researchers from Technische Universität Darmstadt and Intel Labs recently published a paper on this approach and found that labelling a single image culled from Grand Theft Auto V took seven seconds on average. Manually labelling a real-world image, on the other hand, took anywhere from an hour to 90 minutes, so the game-based approach amounts to a massive drop in both work time and the money spent paying people to do the work.
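To put the two figures side by side, the short calculation below works out how much faster seven seconds is than 60 to 90 minutes. The per-image times come from the article. The dataset size of 1,000 images is an arbitrary number chosen for illustration, not a figure from the paper.

```python
# Per-image labelling times quoted in the article.
GAME_SECONDS_PER_IMAGE = 7
MANUAL_MINUTES_LOW = 60
MANUAL_MINUTES_HIGH = 90

# Hypothetical dataset size, chosen only to make the totals concrete.
EXAMPLE_IMAGES = 1000


def speedup(manual_minutes: float, game_seconds: float = GAME_SECONDS_PER_IMAGE) -> float:
    """How many times faster the game-based labelling is per image."""
    return manual_minutes * 60 / game_seconds


def total_hours(images: int, seconds_per_image: float) -> float:
    """Total labelling time in hours for a dataset of the given size."""
    return images * seconds_per_image / 3600


print(f"Speedup: {speedup(MANUAL_MINUTES_LOW):.0f}x to {speedup(MANUAL_MINUTES_HIGH):.0f}x")
print(f"Game images: {total_hours(EXAMPLE_IMAGES, GAME_SECONDS_PER_IMAGE):.1f} hours")
print(
    f"Real photos: {total_hours(EXAMPLE_IMAGES, MANUAL_MINUTES_LOW * 60):.0f} to "
    f"{total_hours(EXAMPLE_IMAGES, MANUAL_MINUTES_HIGH * 60):.0f} hours"
)
```

On those numbers, labelling a game image is roughly 500 to 770 times faster, and a 1,000-image set shrinks from 1,000 to 1,500 hours of work to about two hours.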

Off-the-shelf video games like Grand Theft Auto, Hitman, and the Chicago-set Watch Dogs, just a few examples from the team's paper, offer enough realism and detail to potentially revolutionize machine learning research. Research teams may not have the time or money to manually label real-world images or generate a realistic 3D simulation of their own, and so games can step in to fill the resource gap.

But in order to make it really work, Richter said, game developers need to be open with researchers and collaborate on machine learning work.

"Games have become so realistic in terms of visual quality and the speed at which you can render photorealistic images that it's totally interesting to open up game engines for machine learning work," said Richter. "It would be interesting if game manufacturers actually let us use their worlds."

In an AI-saturated future, it might just be a selling point that a new game is so realistic that even a computer will dig it.