This Robot Can Learn a Task After Seeing You Do It Once In VR

One day you'll teach a robot to cook by demonstration.

Training a robot to do a job is a lot harder than training a person. Humans can learn simply by being shown how to do something, often just once, but machines can require days of training on vast amounts of data before they're able to accomplish even simple tasks, like recognizing a cat in a photo.

Researchers at the Elon Musk-backed OpenAI, a non-profit research institute in San Francisco, have done something pretty awesome: they've figured out how to train a robot to perform a particular task after watching a single human demonstration. They call it "one-shot imitation learning," and it's incredible to watch unfold in a video the institute posted on Tuesday.


It works like this: in virtual reality, a human stacks some digital blocks from the perspective of a robot, like a video game. That information is sent to a real robot, which has a camera that looks down at a set of real blocks. After learning from the VR demonstration, it can stack blocks even if they're positioned differently than in the demonstration. It's not simply pantographic repetition; it's imitation, and imitation that can be used in new situations to boot.
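The distinction between replaying motions and copying intent can be made concrete with a toy sketch. Everything below is invented for illustration (the block names, data shapes, and helper functions are not from OpenAI's system): the idea is that the robot recovers *which block goes on which* from the demonstration, then re-applies that goal to blocks sitting in entirely different positions.

```python
# Hypothetical sketch: one-shot imitation as "copy the intent, not the motions".
# All names and data structures here are invented for illustration.

def extract_intent(demo):
    """From a demonstrated stacking sequence, recover the goal:
    the order in which blocks were stacked, ignoring where they sat."""
    return [step["block"] for step in demo]

def imitate(demo, new_scene):
    """Re-apply the demonstrated stacking order to a scene where
    the same blocks start in different positions."""
    order = extract_intent(demo)
    plan = []
    target = None
    for block in order:
        if target is None:
            plan.append((block, "table"))   # base block goes on the table
        else:
            plan.append((block, target))    # stack on the previous block
        target = block
    return plan

# In VR, the person stacked A, then B on A, then C on B:
demo = [{"block": "A", "pos": (0, 0)},
        {"block": "B", "pos": (3, 1)},
        {"block": "C", "pos": (5, 2)}]

# The real blocks are laid out differently, but the recovered plan is the same:
new_scene = {"A": (9, 9), "B": (1, 4), "C": (2, 2)}
print(imitate(demo, new_scene))
# [('A', 'table'), ('B', 'A'), ('C', 'B')]
```

A pantograph would replay the original arm trajectory and miss the relocated blocks entirely; imitation keeps the goal and discards the trajectory.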

"If you have a robot at home setting tables, the cups or cupboards might look different in different homes," said Peter Welinder, a robotics researcher at OpenAI, over the phone. "So, you need algorithms that can understand the intent of the task that humans are doing."


Stacking blocks is a simple test case, but it moves us one step closer to creating robots that can handle situations in the real world without extensive training. Imagine, for example, teaching your family's robot how to cook a meal by preparing one yourself.

One-shot imitation learning seems a bit like magic, but there's a lot going on under the hood. Two systems are at play: one neural network (layers of digital nodes whose connections adjust as the system learns a task) to process real-world images, and another to process imitation data. The vision network was trained like most others, but things get really funky at the imitation stage. In a paper posted to the arXiv preprint server, the researchers describe how they trained the network on pairs of demonstrations of the same task in simulation. With that base layer of training, the robot has a pretty good idea of what to do when it sees a new demonstration in VR.
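The division of labor between the two networks can be sketched in a few lines. This is a loose caricature, not OpenAI's implementation: the "networks" here are plain Python stand-ins, and every name and shape is an assumption made for illustration. The point is the interface: one component turns pixels into an observation, and the other conditions on a single demonstration to pick the next action.

```python
# Hypothetical two-network sketch: a vision component maps camera input to a
# symbolic observation, and a demonstration-conditioned policy maps
# (observation, demo) to an action. All names and toy logic are invented.

def vision_net(image):
    """Stand-in for a convolutional network: turn raw camera input
    into a symbolic observation (which blocks are present)."""
    return {"blocks": image["detected_blocks"]}

def imitation_net(observation, demo):
    """Stand-in for the imitation policy: given the current scene and
    one demonstration, output the next stacking action."""
    done = observation.get("stacked", [])
    for block in demo:
        if block not in done:
            return {"action": "stack", "block": block}
    return {"action": "done"}

demo = ["A", "B", "C"]                        # stacking order shown once in VR
obs = vision_net({"detected_blocks": ["C", "A", "B"]})
obs["stacked"] = ["A"]                        # the robot has already placed A
print(imitation_net(obs, demo))
# {'action': 'stack', 'block': 'B'}
```

In the real system both components are learned end to end on many simulated demonstration pairs, which is what lets the policy generalize to a demo it has never seen before.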


Read More: Robots Can Quickly Learn New Skills From Humans and Teach Them to Other Robots

Getting a robot to do something by demonstrating it in VR has been done before, by Silicon Valley-based 219 Design, but that was repetition, and this is imitation. It's the difference between tracing a drawing and taking an art class.

Training in VR also makes robotics and neural networks more accessible to the average person, Welinder said, which is important if advanced robots are going to interact with ordinary people, as they increasingly will.

"In the past, if you wanted to show a robot how to do something, you often had to do it by directly programming it, or going through some 2D interface," Welinder said. "At the end of the day, it's much more intuitive to enter virtual reality and carry out the task as if you were the robot."

"We can generalize that to many more tasks beyond block stacking, perhaps things that people can do in their own homes," he continued.

Here's hoping they build one of these things for doing laundry.

Subscribe to Science Solved It, Motherboard's new show about the greatest mysteries that were solved by science.