OpenAI, the nonprofit AI research group co-founded by Elon Musk, has broken new ground in teaching a robot hand to manipulate objects, according to The Verge.

The researchers accomplished their stated goal of having the hand rotate a colored block so that particular faces pointed upward, but the real achievement was in how they taught the hand to perform the task.

Rather than hand-programming each individual task, AI systems like this are typically trained through trial and error, with software reinforcing successful attempts over time. That learning process can be sped up automatically for something like a video game, but physical tasks often demand years' worth of real-world experience, and physical training frequently requires a human on hand to reset or assist the robot, making it an expensive process as well.
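To make the trial-and-error idea concrete, here is a minimal sketch of a reinforcement loop on a toy problem. The environment, action space, and update rule are all illustrative stand-ins, not OpenAI's actual code: the agent repeatedly tries actions, observes a reward, and nudges its value estimates toward whatever worked.

```python
import random

# Toy stand-in for a real task: the environment rewards the agent for
# guessing a hidden target action. Purely illustrative.
class ToyEnv:
    def __init__(self):
        self.target = random.randrange(4)

    def step(self, action):
        return 1.0 if action == self.target else 0.0  # reward

values = [0.0] * 4          # learned value estimate per action
epsilon, alpha = 0.1, 0.1   # exploration rate, learning rate

env = ToyEnv()
for episode in range(1000):
    # Trial and error: mostly exploit the best-known action, sometimes explore.
    if random.random() < epsilon:
        action = random.randrange(4)
    else:
        action = max(range(4), key=lambda a: values[a])
    reward = env.step(action)
    # Reinforcement: move the estimate toward the observed reward.
    values[action] += alpha * (reward - values[action])

print("learned values:", values)
```

On a real robot, each of those thousand episodes would take seconds or minutes of physical motion, which is exactly why the process is so slow outside a simulator.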

Researchers have suggested that they could instead run this training in simulation across many computers simultaneously, compressing the same amount of experience into days or even hours.
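A hedged sketch of that idea, using Python's standard multiprocessing module: many independent copies of a simulated episode run in parallel, and the experience they collect is pooled for training. The episode stub and its output format are assumptions for illustration; the real system would draw rollouts from a physics simulator.

```python
from multiprocessing import Pool
import random

# Each worker runs an independent simulated episode and returns the
# experience it collected (here, fake (state, action, reward) tuples).
def run_episode(seed):
    rng = random.Random(seed)
    return [(rng.random(), rng.randrange(4), rng.random()) for _ in range(100)]

if __name__ == "__main__":
    # Many simulated episodes run concurrently, so the agent can amass
    # "years" of experience in hours of wall-clock time.
    with Pool(processes=8) as pool:
        traces = pool.map(run_episode, range(64))
    print(f"collected {sum(len(t) for t in traces)} transitions")
```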

UC Berkeley robotics professor Ken Goldberg, who reviewed the work, said it was “an important result” in moving toward more economical AI training, noting:

“That’s the beauty of having lots of computers crunching on this. You don’t need any robots. You just have lots of simulation.”

OpenAI did not just simulate the environment; it also injected variability into the simulation to better reflect the real-world task. The researchers added visual background "noise" and randomized the colors of the hand and the cube, the texture of the cube's surfaces, and the cube's weight. The idea was to train the system to handle the unexpected variation that arises in real-life situations, helping it cross the "reality gap" between simulation and the physical world.

Because the robot hand's base could sit at a slightly different angle from one run to the next, and a lower angle makes the hand more likely to drop the cube, the team even randomized the angle of gravity in the simulation, forcing the system to adapt.
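Here is a minimal sketch of what that kind of domain randomization might look like in code. The parameter names and ranges are assumptions chosen for illustration, not values from the paper: before each simulated episode, the appearance and physics settings are re-drawn so the policy never overfits to one exact version of the simulator.

```python
import random

# Re-draw appearance and physics parameters before every simulated episode.
# All field names and ranges below are illustrative assumptions.
def randomize_episode():
    return {
        "cube_mass_kg":   random.uniform(0.05, 0.2),   # randomize object weight
        "cube_friction":  random.uniform(0.5, 1.5),    # stand-in for surface texture
        "cube_color_rgb": [random.random() for _ in range(3)],
        "hand_color_rgb": [random.random() for _ in range(3)],
        # Tilt gravity a few degrees off vertical so the policy learns to hold
        # the cube even when the hand's base sits at an unfamiliar angle.
        "gravity_tilt_deg": random.uniform(-10.0, 10.0),
        "visual_noise_std": random.uniform(0.0, 0.1),   # background pixel noise
    }

for episode in range(3):
    params = randomize_episode()
    print(f"episode {episode}: {params}")
    # ...run the simulated episode with these parameters...
```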

“Without this randomization, it would just drop the object all the time because it wasn’t used to it,” according to Matthias Plappert, a member of the OpenAI research team.

The AI system ultimately accumulated roughly 100 years' worth of experience in simulation, running many simulations in parallel on a vast amount of computing hardware: 6,144 CPU cores and eight Nvidia V100 GPUs.

In the end, the system repositioned the cube as many as 50 times in a row without dropping it.

According to Plappert:

“This shows that what we humans do for manipulation is very optimized. It’s a very interesting moment when you look at a robot trying to solve a problem and you think ‘Oh, hey, that’s how I would do that, too.’”

Many of the algorithms and techniques used to teach this system were first developed for training OpenAI’s video game AI. Significantly, the company has suggested that this success indicates that general purpose algorithms can be useful in training AI for a variety of tasks.
