Meta’s Board Gaming AI Learned Not To Lie

Late last year, Facebook parent company Meta announced the development of Cicero, a new machine learning tool designed to play the board game Diplomacy with human players at a high level. In its announcement, the company makes lofty claims about the impact that the AI, which uses a language model to simulate strategic reasoning, could have on the future of AI development and human-AI relations.

But while the AI system is impressive in many ways, its creators intentionally removed one skill that can be crucial in games like Diplomacy: the ability to lie.

Diplomacy is a complex, highly strategic board game requiring a significant degree of communication, collaboration, and competition between its players. In it, players take on the role of countries in the early years of the 20th century in a fictional conflict in which European powers are vying for control of the continent. It is mechanically simpler but, arguably, more tactically complex than a game like Risk. Your number of units is determined by how many supply centers you control. Individual units can spend their turns holding territory to repel attackers, moving into territory to take it, or supporting the hold and move actions of other units. All players act simultaneously, with the goal of taking the maximum amount of territory.
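For readers who think in code, the hold, move, and support mechanics described above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not webDiplomacy's or Cicero's actual adjudicator: the `Claim` class and `resolve_province` function are hypothetical names, and the sketch ignores convoys, cut supports, and retreats. It captures only the core rule that every unit counts for one, each support adds one, and a move succeeds only when it is strictly stronger than every competing claim.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Claim:
    """One power's claim on a single province for this turn (hypothetical type)."""
    power: str         # country issuing the order
    kind: str          # "hold" (already there) or "move" (trying to enter)
    supports: int = 0  # number of other units supporting this order

    @property
    def strength(self) -> int:
        # Every unit counts for one; each support adds one.
        return 1 + self.supports


def resolve_province(claims: list[Claim]) -> Optional[str]:
    """Return the power occupying the province after the turn, or None if empty."""
    if not claims:
        return None
    top = max(c.strength for c in claims)
    strongest = [c for c in claims if c.strength == top]
    if len(strongest) == 1:
        # A single strongest claim wins, whether it is the holder or an attacker.
        return strongest[0].power
    # Tie for strongest: nobody moves in, so a holding unit (if any) stays put.
    holders = [c for c in claims if c.kind == "hold"]
    return holders[0].power if holders else None


# An unsupported attack bounces off a holding unit.
print(resolve_province([Claim("Germany", "hold"), Claim("France", "move")]))
# The same attack with one supporting unit dislodges the holder.
print(resolve_province([Claim("Germany", "hold"), Claim("France", "move", supports=1)]))
```

Because all orders are revealed at once, a player cannot know in advance how many supports an opponent will commit, which is why negotiation matters so much.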

In a recent Substack post, Gary Marcus and Ernest Davis, two AI researchers unaffiliated with the project, explain that Cicero was designed through a combination of deep learning and hands-on training, using the online version of the game, webDiplomacy. There are two main types of game in Diplomacy: press and no-press. In press games, players are able to communicate with one another to coordinate tactics, make threats, and share information. In no-press games, players are left to fend for themselves, attempting to take territory through their own strategy and military might alone. Cicero was designed to play press games.

The AI was trained through a combination of press and no-press games from webDiplomacy—the community of which has been extremely receptive to the research team. According to Kestas, a co-owner of the website, Meta managed to earn a significant amount of goodwill by helping to overhaul the game’s interface: “When webDiplomacy.net was started in 2005 the UI was pretty cutting edge, but in 2022 it was getting very dated. They replaced the map with a point and click mobile-friendly React-based UI, which you can try without an account on https://play.webdiplomacy.net/, and it has been very popular.”

Cicero, unlike previous complex game AI, couldn’t be trained by playing against itself. AlphaGo, for example, was refined by playing a vast number of simulated games against itself until it could beat top human players. However, this method is a product of Go’s game design.

In the parlance of the frequently critiqued game theory upon which Cicero’s tactical model is based, Go is a two-player, zero-sum game: there are exactly two sides, whatever one gains the other loses, and every game ends in a win or a loss. This allows an AI to simulate enormous numbers of moves and board states and learn to respond well to its human opponent. Diplomacy, on the other hand, has seven players, fluid resources, and degrees of victory. Holding the second largest volume of territory is still a success by Diplomacy standards. This level of complexity makes the game too difficult to master through self-play alone.
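To make the two-player, zero-sum idea concrete, here is a minimal Python sketch of exhaustive search on a toy game. The game (players alternately take one to three stones from a pile, and whoever takes the last stone wins) and the function name `value` are illustrative assumptions, not anything taken from AlphaGo or Cicero, and Go itself is far too large to search this way. The point is only that, with two players and a strict win or loss, one side's result is simply the negation of the other's.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def value(stones: int) -> int:
    """Return +1 if the player to move can force a win, -1 otherwise."""
    if stones == 0:
        # The previous player took the last stone, so the player to move has lost.
        return -1
    # Zero-sum: my result is the negation of my opponent's result after my move.
    return max(-value(stones - take) for take in (1, 2, 3) if take <= stones)


for pile in range(1, 9):
    outcome = "win" if value(pile) == 1 else "loss"
    print(f"{pile} stones: {outcome} for the player to move")
```

With seven players and no single opponent whose loss equals your gain, that negation trick has nothing to latch onto, which is one way to see why Diplomacy resists the same treatment.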

An initial version of Cicero was trained on a corpus of several thousand no-press games, encouraging the AI to derive optimal tactical decisions from existing human strategies. This AI, following a few months of training and testing, became hypercompetent at the game, arguably too competent for the more socially complex press games.

Human beings, because we are extremely smart and transcendently stupid monkeys, have feelings when other people do things to us. The version of Cicero that emerged from training on no-press games was efficient and utterly ruthless—so ruthless that, according to Meta, other players in press games found it difficult to collaborate with Cicero. And in press Diplomacy, you must collaborate if you want any chance at victory.

This social element is what Meta claims makes Cicero unique among current AI. Cicero combines tactical reasoning with a complex language model trained on a massive standard English corpus and the chat logs of a few thousand press games of webDiplomacy. Additionally, unlike many language models, Cicero’s actual dialogue isn’t exclusively predictive. Predictive models (like the suggested words in modern smartphones) don’t understand text; they just choose the most probable sequence of words based on their corpus. This produces convincing but ultimately meaningless text, even when it is factually accurate.
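The "most probable next word" idea can be shown with a toy example. The Python sketch below is an assumption-laden illustration, not how Cicero or any production keyboard works: the three-sentence corpus and the `train_bigrams` and `suggest` functions are invented for this article, and real language models are vastly larger. It simply counts which word most often follows another and suggests that one.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus):
    """Count, for every word, how often each other word follows it."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def suggest(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical chat lines, made up for illustration.
corpus = [
    "i will attack germany",
    "i will support france",
    "i will attack italy",
]
counts = train_bigrams(corpus)
print(suggest(counts, "will"))    # "attack" follows "will" twice, "support" once
print(suggest(counts, "russia"))  # never seen, so there is no suggestion
```

The model will happily suggest "attack" after "will" without any notion of whether an attack is planned, which is the gap between predicting text and meaning it.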

Cicero, on the other hand, has been trained to derive specific information from its conversations, and to engage in collaboration towards specific goals. If you tell Cicero that you plan to attack Germany, it will incorporate that information into its strategic model for the turn. Similarly, Cicero is able to prompt other players with ideas that suit its own goals. For example, if Cicero has previously collaborated with France to take territory in Italy, and finds it tactically advantageous to take territory in Germany, Cicero may encourage France to begin an offensive campaign against Germany, drawing German troops to the French border before mounting its own assault.

However, Cicero does not lie. As in many games, the social rules and practices of high-level Diplomacy are radically different from those of more casual play. If you started a game of Diplomacy with your friends, there would inevitably be the kinds of grand betrayals and ultimately foolish Machiavellian schemes that most people associate with social games like Diplomacy. In casual play, the ability to lie is useful because players lack the tactical mastery to make optimal moves or plan long-term strategies. In this context, social manipulation becomes much more important.

In high-level play, honesty is much more common and much more useful. Making alliances allows your long-term strategies to be significantly more complex, as the example of a coordinated war against Germany referenced above shows. This led the designers of Cicero to make the AI totally honest, and relatively upfront with its plans.

This honesty presented unique challenges to Meta’s team, as the corpus upon which Cicero was trained included human players lying. In a comment to Motherboard, Andrew Goff, a Diplomacy pro who worked closely with the Meta team on the project, said: “One of the most interesting findings was that Cicero performs better when it doesn’t lie and the language model needed to overcome the density of human lying in the training data in order to ‘get good’—just like the best human players learn that lies are a poor strategy Cicero learnt that too.”

In a video explaining Cicero, Meta claims that the AI even apologizes for and explains the tactical rationale behind its more aggressive grabs for territory, which allows it to maintain healthy diplomatic relations and facilitates collaboration with former foes. However, Cicero has also been trained to withhold information that would put it at an active disadvantage. Cicero would not, for example, disclose to a bordering nation that it planned to divert the majority of its troops to the German border before actually executing the move. Still, according to Goff, Cicero would answer honestly if you asked it directly about its plans:

“…[T]he more ‘truthy’ CICERO became the more likely it was to give away tactics or just the general idea that it was going to attack someone to the person it was going to attack,” Goff said in an email. “The answer? It didn’t matter! Performance was better even if CICERO just straight out said it was attacking when it was attacking. This is something I do as a player—there’s no point lying about it most of the time, and by telling the truth the player knows you’ll be trustworthy when you say you won’t attack. CICERO learned on its own not to volunteer information (good, bad, or indifferent) without intent, but if you asked it a specific question it would usually give an honest answer. This was also true of tactics, but this is trivial—the trust factor a human places on that information is zero—if I am attacking you and I tell you my moves, then you assume I’m lying… but then assume I’m tricking you and telling the truth, but then assume I’m double-bluffing…. and so on—so while it looks like this could be a vulnerability it isn’t.”

As Marcus and Davis point out, all of this is extremely clever, but more importantly, extremely specific to the end to which Cicero was actually built: playing high-level, blitz Diplomacy, which limits players to 5-minute turns. Unlike some other deep learning AI, Cicero is not easy to retrain. Cicero’s model is built from a particular, intentionally constructed corpus, one which has been diligently labeled by human hands. Cicero can recognize the plans of other players only because the information being discussed in Diplomacy is relatively simple, even if the tactics are complex.

As Marcus and Davis suggest, Cicero is pointing towards a different way of thinking about AI design. In the last few years, AI research and the popular science writing that it spawns have become obsessed with deep learning: the ability of an AI to train itself to produce particular outputs after being presented with a large corpus of data. This strategy allows AI to create very convincing facsimiles of real human work, devoid of the meaning inherent in what people actually make. It cannot distinguish true from false information, nor derive effect from cause. It can only mimic these acts, predicting what word or pixel or chess move is statistically most likely to come next based on its training corpus and most recent prompt.

Cicero rejoins intentional, goal-oriented AI design with deep learning practices, and the results are extremely impressive. However, it reinforces the fact that for AI systems to be capable of human-level performance, they must be intentionally and carefully designed to do so by human hands. Meta modified Cicero’s corpus extensively, censoring personal details like names, hand-labeling specific information, and modifying the tone which Cicero learned from human players.

“I’d also add there were lots of other, non-AI ethical considerations too—the level of consideration we gave to privacy was extreme, redacting anything that could be remotely personal…The internal controls on that were really impressive, and the team in general took the approach that ethical research considerations were key parts of the challenge, not obstacles to success.”

Adding to this in a separate comment to Motherboard, site co-owner Kestas said: “[Working on the project was] stressful at times, delivering batches of data on time and ensuring it was all redacted properly while delivering as much data as possible, but very rewarding.”

Cicero suggests that if you want a competent language model capable of influencing human behavior, it has to be designed around the specific behaviors it is trying to adjust, and that this can only be done in the context of a system simple enough to be broken down into data tables and boolean decisions.

Cicero’s success does not, as some people have worried, indicate that it could be capable of real diplomacy or manipulative tactics. Cicero’s decision making is based on game theory, a framework used in economics and sociology which has been critiqued time and time again because it makes the incorrect assumption that, in the real world, humans are rational actors working to rational ends in rational systems. Cicero is playing a game with known actors and known rules.

Humans are brilliant, fallible, and infinitely complex. Our systems mirror this. Unlike a Diplomacy player, a given statesman does not know every legal or social rule to which they must adhere. Cicero has perfect knowledge not only of the state of a particular board, but also of the static rules and social conventions of a specific game. That’s all it is: a machine built to play a board game very well.

If AIs become dangerous, or cruel, it is not because we did not constrain them enough, but because we did not constrain their designers and have built them to engage with systems that already facilitate human cruelty. They are not terrifyingly powerful products of circumstance, but tools built by human hands to human ends—dangerous in the deeply mundane ways that all human tools can be.