The Plan to Build a Massive Online Brain for All the World's Robots

If you walk into the computer science building at Stanford University, Mobi is standing in the lobby, encased in glass. He looks a bit like a garbage can, with a rod for a neck and a camera for eyes. He was one of several robots developed at Stanford in the 1980s to study how machines might learn to navigate their environment---a stepping stone toward intelligent robots that could live and work alongside humans. He worked, but not especially well. The best he could do was follow a path along a wall. Like so many other robots, his "brain" was on the small side.

Now, just down the hall from Mobi, scientists led by roboticist Ashutosh Saxena are taking this mission several steps further. They're working to build machines that can see, hear, comprehend natural language (both written and spoken), and develop an understanding of the world around them, in much the same way that people do.

Today, backed by funding from the National Science Foundation, the Office of Naval Research, Google, Microsoft, and Qualcomm, Saxena and his team unveiled what they call RoboBrain, a kind of online service packed with information and artificial intelligence software that any robot could tap into. Working alongside researchers at the University of California at Berkeley, Brown University, and Cornell University, they hope to create a massive online "brain" that can help all robots navigate and even understand the world around them. "The purpose," says Saxena, who dreamed it all up, "is to build a very good knowledge graph---or a knowledge base---for robots to use."

The core team behind the RoboBrain Project (from left): Aditya Jami, Kevin Lee, Prof. Ashutosh Saxena, Ashesh Jain, Ozan Sener, and Chenxia Wu. Photo: Ashutosh Saxena

Any researcher anywhere will be able to use the service wirelessly, for free, and transplant its knowledge to local robots. These robots, in turn, will feed what they learn back into the service, improving RoboBrain’s know-how. Then the cycle repeats.

These days, if you want a robot to serve coffee or carry packages across a room, you have to hand-code a new software program---or ask a fellow roboticist to share code that's already been built. If you want to teach a robot a new task, you start all over. These programs, or apps, live on the robot itself, and that, Saxena says, is inefficient. It goes against all the current trends in tech and artificial intelligence, which seek to exploit the power of distributed systems, massive clusters of computers that can power devices over the net. But this is starting to change. RoboBrain is part of an emerging movement known as cloud robotics.

The Dawn of Cloud Robotics

The concept was popularized in 2010 by Google's James Kuffner, one of the engineers behind the tech giant's self-driving cars. In the years since, the idea has slowly spread.

In 2011, the European Union's Seventh Framework Programme, its research-funding arm, launched RoboEarth, an initiative that lets robots "share knowledge" via a world-wide-web-style database and "access powerful robotic cloud services," according to the project's website. The source code is available online, and the team has already made strides toward building a kind of remote brain. Then, last year, Kuffner and Ken Goldberg, a RoboBrain collaborator at Berkeley, published a paper describing a robotic grasping system powered by Google’s object recognition engine and other data sources.

There's also the DAvinCi Project, which aims to supercharge service robots using the popular distributed computing software Hadoop, a way of crunching vast amounts of data across hundreds of machines. And in October, the IEEE Robotics & Automation Society made a call for papers for a special issue on cloud robotics in response to increased interest in this field from researchers, companies, and governments.

Similar ideas can be traced back to a man named Masayuki Inaba. In the '90s, he envisioned robots that would move through the physical world but tap the power of supercomputers across the internet. Back then, the computing infrastructure to make this possible didn't exist. Today, tech companies have ready access to enormous amounts of computing power. Startups and universities can get Hadoop and other distributed software from companies like Cloudera or run it on cloud services from the likes of Amazon. The Amazon cloud is where RoboBrain lives.

The Big Data Problem

Still, hurdles remain. Unlike technologies such as Apple's Siri voice assistant or Google's speech-recognition and image-tagging systems, robots must juggle many types of data from many sources. Like humans, they're "multi-modal systems," and this creates unique challenges. "The first challenge is how do we come up with a storage layer that will support different modalities of data," says Aditya Jami, RoboBrain's lead infrastructure engineer.

This is what RoboBrain seeks to create. Building the right online storage system, Jami says, is a crucial step toward integrating the 100,000 data sources and the various supervised and unsupervised machine-learning algorithms the researchers hope to merge into one huge online network.

A sample brain graph. Image: RoboBrain Project

Jami---who previously built large-scale computing systems at Netflix and was part of the Yahoo team that spawned various big-data systems, such as Hadoop---says he is developing a storage layer that can merge separate learning models. A deep neural network that lets robots "see" things or grasp objects, for instance, can dovetail with another system that examines the relationship between different types of objects.
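To make the idea of models "dovetailing" in a shared knowledge graph concrete, here is a minimal sketch in Python, assuming a simple graph structure; the object names, relations, and scores are invented for illustration and are not RoboBrain's actual schema or code.

```python
# A minimal, illustrative sketch of two separately trained models writing
# into one shared knowledge graph. A vision model contributes object
# detections; a relational model contributes links between objects.
# All names and numbers below are made up for the example.
import networkx as nx

graph = nx.DiGraph()

# Perception: what a robot's camera reported seeing, with confidence scores.
graph.add_node("mug", modality="vision", confidence=0.92)
graph.add_node("kitchen_counter", modality="vision", confidence=0.88)

# Relations: how objects tend to relate to one another and be handled.
graph.add_edge("mug", "kitchen_counter", relation="typically_found_on", weight=0.8)
graph.add_edge("mug", "grasp_by_handle", relation="preferred_grasp", weight=0.9)

# A robot planning a task can now traverse knowledge that neither model
# produced on its own: where mugs tend to sit, and how to pick them up.
for _, target, data in graph.out_edges("mug", data=True):
    print(f"mug --{data['relation']}--> {target} (weight={data['weight']})")
```

In a cloud-robotics setup like the one Jami describes, a graph of this sort would live on shared infrastructure rather than on any single robot, so every connected machine could read from it and write back to it.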

Today, he says, things don't always work this way. Disparate AI systems are often developed independently, and don't use standard data formats. (Though this too is starting to change thanks to deep learning, a form of artificial intelligence that seeks to mimic how the brain works. Part of the great promise of deep learning, experts say, is the emergence of a common language and formulas for speech, vision and natural language processing.)

Jami's ambition is for RoboBrain to become a platform like Hadoop---a de facto standard everyone can use and contribute to. Having this sort of common language, he says, will speed up the development of robotics algorithms, spur collaboration, and help usher in the age of multi-modal artificial intelligence.

Common Sense Knowledge

"Any intelligent agent in the real world needs to do three tasks: perception, planning, and language," Saxena says. That's why RoboBrain feeds off an object detection system; PlanIt, a simulation through which users can teach robots how to grasp objects or move about a room; and a system called Tell Me Dave, a crowd-sourced project that teaches robots how to understand language.

Soon, the researchers will add other types of learning models and data sources, such as ImageNet, 3D Warehouse, and YouTube videos. And the knowledge people---and robots---feed into RoboBrain will flow back into the models that make it up, helping to tune these interconnected AI systems. Saxena and his team are already testing RoboBrain with a handful of robots, with good results.

By merging all this software and data, the researchers hope to create a system with a primitive sense of perception, one that can "discover most of the common sense knowledge of the world," says Bart Selman, a RoboBrain collaborator at Cornell.

Right now, context is not something computers are good at deciphering. Unlike humans, robots don't know to move out of the way if people are in their path. That's why there's so much worry about robots causing accidents in homes and industrial settings. People like Selman are still a long way from changing this. But they're making progress. Mobi looks more quaint by the day.