Containing a Superintelligent AI Is Theoretically Impossible
The halting problem strikes again.
Machines that "learn" and make decisions on their own are proliferating in our daily lives via social networks and smartphones, and experts are already thinking about how we can engineer them so that they don't go rogue.
So far, suggestions have ranged from "training" self-learning machines to ignore certain kinds of information that might teach them racism or sexism, to coding them with values like empathy and respect. But according to some new work from researchers at the Universidad Autónoma de Madrid, as well as other schools in Spain, the US, and Australia, once an AI becomes "superintelligent"—think Ex Machina—it will be impossible to contain it.
Well, the researchers use the word "incomputable" in their paper, posted on the arXiv preprint server, which in the world of theoretical computer science is perhaps even more damning. The crux of the matter is the "halting problem" devised by Alan Turing, which holds that no algorithm can correctly decide, for every possible program and input, whether that program will run forever or whether it will eventually halt—that is, stop running.
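Turing's argument can be sketched in a few lines of Python. The `halts` function below is the hypothetical decider at the heart of the proof—a name chosen for illustration, not real, implementable code—and the contradiction comes from feeding the "do the opposite" program to itself:

```python
def halts(program, program_input):
    """Hypothetical oracle: returns True iff program(program_input)
    eventually stops. Turing's proof shows no such function can exist,
    so this sketch just raises."""
    raise NotImplementedError("provably impossible to implement")


def paradox(program):
    """Do the opposite of whatever the oracle predicts."""
    if halts(program, program):
        while True:   # predicted to halt, so loop forever
            pass
    else:
        return        # predicted to loop, so halt immediately


# Feeding paradox to itself is the contradiction: if
# halts(paradox, paradox) returned True, paradox would loop forever;
# if it returned False, paradox would halt. Either way the oracle
# is wrong about at least one program, so no general oracle exists.
```

The trick is pure diagonalization: the oracle is asked about a program built specifically to contradict the oracle's own answer.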
Imagine a superintelligent AI running a program that contains every other program in existence. The researchers provide a logical proof that if such an AI could be contained, then the halting problem would by definition be solved. To contain that AI, the argument goes, you'd first have to simulate it—but it already simulates everything else, and so we arrive at a paradox.
"This is a standard methodology in theoretical computer science," Manuel Alfonseca, a professor at the Universidad Autónoma de Madrid's Escuela Politécnica Superior and lead author of the work, wrote me in an email. "You prove that a problem can be reduced to another one, and then what you know about the second problem can be applied to the first."
Basically, since the halting problem is, in theoretical computer science lingo, "undecidable," then containment of an AI is computationally impossible.
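The reduction itself can be sketched, too. Suppose we had a containment checker—call it `is_harmful`, a hypothetical name for illustration—that could decide whether a program would ever harm humans. The sketch below shows how such a checker could be turned into a halting decider, which Turing proved cannot exist:

```python
def do_harm():
    """Stand-in for any behavior a containment check would forbid."""
    pass


def is_harmful(program):
    """Hypothetical containment oracle: returns True iff running
    `program` would ever trigger harmful behavior. The reduction
    below shows it would solve the halting problem, so it too
    can only raise."""
    raise NotImplementedError("reduces to the halting problem")


def halts_via_containment(program, program_input):
    """Turn a containment oracle into a halting oracle."""
    def wrapper():
        program(program_input)  # run the program under test...
        do_harm()               # ...then do something harmful

    # wrapper is harmful exactly when program(program_input) halts,
    # so deciding harmfulness would decide halting.
    return is_harmful(wrapper)
```

Since `halts_via_containment` would solve an undecidable problem, no general-purpose `is_harmful` can exist either—which is the paper's conclusion in miniature.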
Of course, the very idea of a superintelligent AI capable of simulating the entire state of the world is thought by some to be merely theoretical. Despite warnings from technology luminaries and philosophers, many computer scientists believe that superintelligence is more science fiction than science.
The halting problem has been used to think through other tricky issues relating to the future of advanced artificial intelligence. In 2014, a team of researchers concluded that a similar problem existed for the ethics of machines: it is theoretically impossible to design an algorithm that can correctly predict whether another algorithm will act "morally."
But, again, this is all theory, not a guarantee about the future material state of the world. The solution to the moral aspect of the halting problem might be, as Motherboard editor Michael Byrne wrote, to prove with a degree of certainty through tests that a given AI will act morally. After all, just because one algorithm can't predict the behaviour of another doesn't mean we can't design that other algorithm to behave properly in the first place.
Alfonseca isn't so sure about a similar solution to the containment problem.
"I believe that if [superintelligent] AI were possible, it would not be feasible to make sure that it won't ever cause harm to humans," he wrote.
Let's just hope we never get there, then.