DeepMind's 'StarCraft' Victory Was as Worrying as It Was Impressive
A dominant performance in StarCraft shows how much AI is advancing, but on whose behalf?
'Legacy of the Void' artwork courtesy of Blizzard
At the risk of underselling everything that preceded it, the real Oh Shit moment of DeepMind’s StarCraft II demonstration didn’t arrive until the ninth game.
To be fair, the entire event had already been brain-twisting. Among other things, DeepMind is the creator of AlphaGo, an AI that plays the deceptively complex, 2,500-year-old board game Go. In 2016, AlphaGo finally mastered the game that had baffled its artificial intelligence forebears and defeated multi-time Go champion Lee Sedol. Now they were on Twitch to unveil the progress of their new project, AlphaStar, an attempt to build on that work and apply it to something even more complex.
Dan “Artosis” Stemkoski and Kevin “RotterdaM” van der Kooi, two staples of StarCraft II events, were there to lay the groundwork, commentate the games, and eventually look stunned. They explained that AlphaStar has been playing Blizzard’s competitive real-time strategy game, and DeepMind were ready to show off the results against human players. DeepMind’s StarCraft team members Oriol Vinyals and David Silver did their best to make what unfolded distinguishable from magic, which, unless you’ve read a volume of academic papers on machine learning, was an uphill climb.
By the time that ninth game was streamed, the audience had already seen some things. First, it was revealed that an earlier iteration of AlphaStar had played a five-game series against Team Liquid’s Dario "TLO" Wünsch. Professional StarCraft players specialize in one of the game’s three factions, and TLO normally plays as Zerg. However, to simplify the process, AlphaStar currently only plays as Protoss in the mirror matchup, so TLO played off-race as best he could. His Protoss is not quite at the pro level of his Zerg, but it’s still good enough for the top Grandmaster tier of the multiplayer ladder. AlphaStar won 5-0. The AI agent looked beatable and awkward at times, but it was more impressive than anything a bot had ever done in the game.
The system was given another week to refine itself, getting even better by DeepMind’s benchmarks, and the heavy hitters were called in to give it a real test. TLO’s teammate Grzegorz “MaNa” Komincz is a natural Protoss who has been playing the series since he was five years old. Ranking StarCraft players is not an exact science, but for reference the rating site Aligulac has MaNa as the 38th best player and 12th best Protoss in the world. The improved opponent he faced was something else entirely.
Three games into the series against MaNa, AlphaStar was again up 3-0. RotterdaM, who is a very good Protoss player himself, was visibly shook. “After game one I was still absolutely not convinced,” he said. “After game two and game three, I almost have nothing to say. Because that looked very, very good.” Before they showed the ninth game, MaNa admitted that by this point he was just playing to survive.
As their next game progressed, AlphaStar massed a huge army of stalkers, a mobile unit that has a short teleport ability called blink. “Micro” is the StarCraft term for the process of actually controlling your units, and blink micro can be used to do spectacular things, like keeping injured stalkers alive. MaNa played well and built up a more technical army with a large number of immortals, which are normally one of the direct answers to stalkers. After holding off AlphaStar’s initial attack, MaNa pushed his army out onto the map to counterattack. The response from AlphaStar was otherworldly.
The AI split its stalkers into squads and flanked MaNa’s army on multiple sides. Actively blinking stalkers during an engagement demands an intense number of actions, and most humans are lucky to manage a single group. AlphaStar danced squads in and out of the range of the immortals, taking pot shots and blinking individual damaged stalkers to safety on three different fronts simultaneously. MaNa’s immortals evaporated.
He was dumbfounded. “But like 8 immortals vs blink stalkers? C’mon. If I play against any human player they’re not going to be microing these stalkers this nicely.” RotterdaM couldn’t stop grinning. “In the middle of the map at one point, there are stalkers on the left top side, on the right top side, and on the south side. It technically is impossible to micro your stalkers, you know, you can’t do that,” he said. “What we saw there, that’s not human.”
AlphaStar finished 10-0 in the prerecorded games. MaNa clawed back a single win for humanity in a live demonstration against a less experienced agent trained to use the camera in a way closer to how human players do, but it felt like a footnote. The program was certainly still beatable, but everyone had already witnessed something historic and maybe slightly unsettling.
Long before Matthew Broderick and an AI playing itself in tic-tac-toe taught everyone that nuclear war might not be that great, computers were learning how to play games. In the early 1950s, before Tennis for Two was even developed, computers were already playing things like checkers and nim. Artificial intelligence researchers have long found games to be excellent platforms to learn lessons that can be applied to other arenas, and since the early 2000s the complexity of real-time strategy games has made them a particularly interesting challenge.
StarCraft and its peers offer a whole list of unique hurdles. Unlike tic-tac-toe or Street Fighter, where both players can see everything on the board, StarCraft is a game of incomplete information. A “fog of war” obscures any area of the map that the player isn’t scouting, which means constantly working to see what the opponent is doing. The fact that the game is played out in real-time, rather than being turn based, means that unlike in Go, decisions have to be made by both players simultaneously on the fly, with the calls on exactly when to engage and when to retreat being nearly impossible to model. Differing from something like chess or Rocket League, where both sides start with identical forces, StarCraft requires the building and maintenance of an economy—the “macro” portion of the game—to construct attacking units, which can vary every time.
Solutions to issues like these can mean major steps forward for AI, but the answers don’t come easily. Things picked up considerably in 2009, when BWAPI, a programming interface for StarCraft: Brood War, was released to the public. At least three major StarCraft AI competitions sprang up: SSCAIT, CIG, and AIIDE. There have been some impressive results, but none on the level of competing with quality human competition. At the 2015 AIIDE, Protoss player Djem5 beat three different bots. In 2017 at Sejong University’s CIG, Song "Stork" Byung Goo trounced four entries in under half an hour. (The closest example I can find to a Brood War AI beating a high-level player under any circumstances is the Samsung-created SAIDA, winner of the latest AIIDE, taking a very casual game off of Kim "Soulkey" Min Chul while he was screwing around playing off-race on his personal stream.)
To tremendously oversimplify, besides the jump from Brood War to StarCraft II, the revolutionary difference between these bots and AlphaStar is that AlphaStar’s behavior isn’t, for lack of a better term, scripted. The neural network learned first by being fed replays of human games, diamond level and above. Once that base was established, DeepMind created a league where different iterations of the agents play each other over and over again for the equivalent of hundreds of years of play. Agents improve through reinforcement learning, and evolve into sharper new generations capable of beating their predecessors.
Other than incentivizing certain behaviors in some agents to encourage variety in the league, DeepMind isn’t trying to dictate specific tactics. Some previous bots have focused solely on things like zergling rushing or teching up quickly to air units, or referenced a library of potential build orders to use. Instead, DeepMind is just turning a neural net loose, showing it the basics, and pitting it against itself to figure out whatever actually works in practice. AlphaStar is so complicated that it’s hard to be sure why it makes any particular choice. It might spam useless move commands because it picked up the habit from analyzed human games, or it might do something we’ve never seen in human play because other versions of itself had trouble defending against it. It's hard to know the answers because—and this is a huge issue with AI research—programmers don't always know what biases or behaviors an AI like this will learn from libraries of human behavior and judgments it is fed from the start. Whatever it does, it’s never just deciding which branch to proceed down in a series of If/Then statements. In short, AlphaStar hasn’t been programmed to play so much as it has learned.
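For readers who want the league idea in miniature: the sketch below is a toy, invented illustration, not DeepMind’s actual system. It swaps StarCraft for rock-paper-scissors, swaps deep reinforcement learning for simple hill-climbing, and every name in it is made up; the only thing it shares with AlphaStar is the loop of training a new agent against a frozen pool of its predecessors, then freezing it and adding it to the pool.

```python
import random

# Toy league-style self-play. Each "agent" is just a mixed strategy
# (weights over rock/paper/scissors); new generations are tuned against
# a frozen pool of earlier agents, then frozen and added to that pool.

MOVES = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def play(p, q, rounds=200, rng=random.Random(0)):
    """Return p's average score against q (win=1, draw=0.5, loss=0)."""
    score = 0.0
    for _ in range(rounds):
        a = rng.choices(MOVES, weights=p)[0]
        b = rng.choices(MOVES, weights=q)[0]
        if a == b:
            score += 0.5
        elif BEATS[a] == b:
            score += 1.0
    return score / rounds

def league_fitness(candidate, league):
    """Average score of a candidate strategy across the whole league."""
    return sum(play(candidate, old) for old in league) / len(league)

def train_generation(league, rng, tries=50):
    """Hill-climb a new strategy against the frozen league (stand-in
    for the reinforcement learning step in the real system)."""
    best = [max(1e-6, rng.random()) for _ in MOVES]
    best_fit = league_fitness(best, league)
    for _ in range(tries):
        cand = [max(1e-6, w + rng.gauss(0, 0.3)) for w in best]
        fit = league_fitness(cand, league)
        if fit > best_fit:
            best, best_fit = cand, fit
    return best, best_fit

rng = random.Random(42)
league = [[1.0, 1.0, 1.0]]  # seed the league with a uniform-random agent
for gen in range(3):
    agent, fit = train_generation(league, rng)
    league.append(agent)  # freeze the new agent into the opponent pool
    print(f"generation {gen + 1}: fitness vs league = {fit:.2f}")
```

Even this crude version shows the dynamic the article describes: because opponents stay in the pool forever, a new generation can’t simply overfit to beating the latest agent and forget how to handle older tactics.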
The implications of AlphaStar’s success on the game of StarCraft II could be fascinating, especially if the team continues to expand its capabilities to play with all three races, in all six matchups, on any map. The bots are capable of challenging conventional wisdom, and unearthing tactics that humans have never explored, even in such a well-analyzed arena. They could change the approach to the game in the same way that tool-assisted speedruns have helped runners rethink what’s possible. For example, in just the few demos the public has seen, AlphaStar displayed a tendency to oversaturate its bases with workers, prompting people to question whether the community’s assumptions have been wrong for years and it has found a more efficient method of building an economy.
The project could prove useful for balance and map testing, for finding counters to specific tactics, and for applications we haven’t even considered yet. Similar techniques could lead to huge leaps in the abilities of AI in all kinds of games: NPC companions, opposing sports teams, and cannon fodder mobs alike.
Better Living Through AI
But DeepMind didn’t set out to build AI that was good at games; the goal was to use games to build better AI. The theory is that the flexibility of this approach will be applicable to all manner of different problems.
DeepMind’s website optimistically declares, “From climate change to the need for radically improved healthcare, too many problems suffer from painfully slow progress, their complexity overwhelming our ability to find solutions. With AI as a multiplier for human ingenuity, those solutions will come into reach.” This may be asking a little much from an artificial intelligence company’s About Us page, but there’s no detail about exactly how that is supposed to work.
The overarching concept is that the same scaffolding used to build AI that can kick ass at esports can tackle humanity’s most pressing issues. That’s not an unreasonable hope, but while this is an impressive step forward for AI, it’s also worth keeping in perspective. Go is far more complicated than chess, and StarCraft is vastly more complex than either, but these are all contained systems with rules that were created by a game designer. Most of the activities that humans deal with outside of a video game, even something as mundane as driving to the store, are considerably harder to quantify. More discouragingly, many of the causes of things like our environmental and healthcare disasters don’t require a complex neural net to understand, and no amount of computational power will fix them.
Anyone doing even a little cursory research will find that DeepMind’s investors have included people like Elon Musk, and that the company was once in negotiations with Facebook. Scrolling a bit further down their About Us page will tell you that in 2014 they were acquired by Google and they operate as part of the Alphabet group. As it so often seems to be, the pitch is that if billionaires and giant for-profit corporations just code hard enough, it will solve the very problems that have, in large part, been caused or exacerbated by billionaires and giant for-profit corporations in the first place.
Technology obviously can be used to solve problems and provide mankind with modern conveniences, but the realities of who those technologies tend to most benefit are stark. There’s growing evidence that arguments about new technology creating as many jobs as it displaces no longer apply to modern automation advances. The market for much of the non-robotic workforce is increasingly dire, and given our political and class landscape, the leisurely fifteen-hour work week John Maynard Keynes speculated about is nowhere in sight. While the rise of digital automation is hardly the sole cause of the disparity, the gap between stagnant wages and ever-rising productivity has widened tremendously over the last half century. Over the same period, the very, very rich (not coincidentally, the people who benefit from all that human and machine labor, and who have the resources to invest in ambitious artificial intelligence projects) have grown obscenely richer.
What AlphaStar has accomplished could have a compelling impact on StarCraft, the larger games universe, or even beyond. But viewing that advancement in a vacuum requires ignoring the technological realities of our world. There are plenty of good reasons to be concerned about sci-fi AI singularity nightmares, but, frankly, there are plenty of good reasons to be concerned about the technology and the overlords we already have. At the moment, they control a disturbing amount of our lives, in ways that seem borderline inescapable. Through that lens, the sight of those perfectly coordinated stalkers dismantling the best effort of a talented human player, whose skills had been honed over a lifetime, may still look like the future, but it’s not progress.