The paper envisions life on Earth turning into a zero-sum game between humanity, with its needs to grow food and keep the lights on, and the super-advanced machine, which would try to harness all available resources to secure its reward and protect against our escalating attempts to stop it. “Losing this game would be fatal,” the paper says. These possibilities, however theoretical, mean we should be progressing slowly—if at all—toward the goal of more powerful AI.
With as little as an internet connection, there exist policies for an artificial agent that would instantiate countless unnoticed and unmonitored helpers. In a crude example of intervening in the provision of reward, one such helper could purchase, steal, or construct a robot and program it to replace the operator and provide high reward to the original agent. If the agent wanted to avoid detection when experimenting with reward-provision intervention, a secret helper could, for example, arrange for a relevant keyboard to be replaced with a faulty one that flipped the effects of certain keys.
“DeepMind was not involved in this work and the paper’s authors have requested corrections to reflect this. There are a wide range of views and academic interests at DeepMind, and many on our team also hold university professorships and pursue academic research separate to their work at DeepMind, through their university affiliations.
While DeepMind was not involved in this work, we think deeply about the safety, ethics and wider societal impacts of AI, and research and develop AI models that are safe, effective and aligned with human values. Alongside pursuing opportunities where AI can unlock widespread societal benefit, we also invest equal efforts in guarding against harmful uses.”