AI machines can solve complex equations faster than we can write them, beat chess grandmasters at their own game, and even, to some extent, recognise human emotions. But they still struggle to reason and make sense of new information, raising ongoing questions about how to define what's really artificially intelligent.
Gary Marcus, a professor of cognitive science at New York University, is the co-chairman of a group that proposes not just one test of artificial intelligence, but a whole series. He wants to call it the "Turing Championship."
If you want to see how smart a machine is at the moment, your first stop would probably be to give it a Turing test. Alan Turing was a British mathematician, computer scientist, code breaker and (just to make you feel bad) marathon runner. In 1950 he introduced his test to determine whether a machine could display human-like intelligence, based on the premise that a pass-worthy machine would be able to fool a human into believing it was a real person. But it's increasingly recognised that the 60-year-old test needs a change.
"Machines are now able to 'beat' the test, under somewhat dubious circumstances, but it's clear that it is a kind of meaningless victory," said Marcus, who is among those who think a revision is in order. Turing's interest was in creating a genuinely smart machine, but it's been shown you don't necessarily need real intelligence to pass his test.
Most recently, a program known as Eugene Goostman passed, not through brains but through cunning. Eugene Goostman fooled 33 percent of judges (30 percent was the threshold to pass) in a series of short, five-minute interactions. By pretending to be young and not a native English speaker, the program managed to fool people into believing it was human. "The programmer found a loophole in Turing's sixty-year-old rules, not a solution to AI," Marcus said. "It's time to close the loophole."
I asked Marcus what features the new tests would have and he told me the Turing Championship "would have different events geared towards different aspects of language and thought." One idea Marcus suggested in a column he wrote for the New Yorker was a test on "genuinely comprehending new information, like movies or books that haven't already been summarized on the web." Marcus believes that if a machine could watch a video and answer questions about its content, like "Why were people protesting?", that would give a better indication of its intelligence.
Relying on any single test makes the rules easier to exploit. But if a program is subjected to a number of tests, each requiring a different form of "intelligence," it will be harder to win through deception.
A further criticism of the Turing test is that it relies on people's interactions with machinery. "I don't think the Turing Test was ever a good indicator for artificial intelligence," Marcus told me. "As far back as the 1960s, when the program Eliza came on the scene, it was clear that people could be fooled fairly easily."
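Marcus's point about Eliza is easy to demonstrate. The sketch below is a minimal Eliza-style responder with a few hypothetical rules (not Joseph Weizenbaum's original script): the program has no model of meaning at all, yet sounds conversational simply by reflecting the user's own words back inside stock questions.

```python
import re

# A handful of illustrative pattern-response rules. Each regex captures a
# phrase from the user's input; the template echoes it back as a question.
RULES = [
    (re.compile(r"i feel (.+)", re.I), "Why do you feel {0}?"),
    (re.compile(r"i am (.+)", re.I), "How long have you been {0}?"),
    (re.compile(r"my (.+)", re.I), "Tell me more about your {0}."),
]
DEFAULT = "Please, go on."  # fallback when nothing matches


def respond(utterance: str) -> str:
    """Return a canned reflection of the user's words, with no comprehension."""
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            # Re-use the user's own phrase, stripped of trailing punctuation.
            return template.format(match.group(1).rstrip(".!?"))
    return DEFAULT


print(respond("I feel anxious about the future"))
# -> Why do you feel anxious about the future?
```

The reply looks attentive, but the program has only matched a surface pattern; it is exactly this kind of shallow trick that lets chatbots fool people without any progress toward real intelligence.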
If you've ever watched a magician perform, you'll know how easy it is to be duped. And misleading people doesn't signify progress in artificial intelligence. "In my opinion, the test stuck around longer than it should have, and plenty of people have been saying that for years," Marcus said.
He thinks a group of tests would work better because of how hard it is to define intelligence. "It's not a one-dimensional variable; there are lots of things that go into it. I don't think we want a single number (like an IQ test)." Marcus wants to "lead machines to have a deeper understanding of language and the world than is currently possible."
He and his team, which includes Manuela Veloso, president of the Association for the Advancement of Artificial Intelligence (AAAI), and Francesca Rossi, president of the International Joint Conferences on Artificial Intelligence (IJCAI) board of trustees, haven't yet decided how the tests will be conducted and are holding a workshop in January to discuss the rules.