Google’s DeepMind Is Training Robots With Video and Language Models

February 8, 2024


In 2024, Google’s DeepMind Robotics researchers are among the many teams exploring how generative AI and large foundation models can be applied to robotics, in areas such as learning and product design. Anticipation is high for what DeepMind’s approach to robot training could make possible.

Today, the team is highlighting research aimed at giving robots a better understanding of what humans expect from them. Instead of repeating the same task over and over, robots need to recognize and react to changes in their environment or mission parameters. That adaptability would let robots work in more dynamic settings, such as a factory, a hospital, or even a home.

The DeepMind team designed AutoRT to harness large foundation models for several purposes. As an example, the system uses a Visual Language Model (VLM) for improved situational awareness. AutoRT also enables a fleet of robots to work in tandem, using cameras to map out their environment and identify objects.

The robots carry out tasks suggested by a large language model (LLM). LLMs are widely seen as key to letting robots understand more natural language commands, reducing the need to hard-code skills. AutoRT was tested extensively over seven months, coordinating up to 20 robots simultaneously and 52 unique robots in total.
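The flow described above (a vision model describes the scene, a language model suggests tasks, and the fleet carries them out) can be sketched in a few lines of Python. This is a toy illustration only, not DeepMind’s code: `describe_scene`, `propose_tasks`, `assign_round_robin`, and the `Robot` class are hypothetical stand-ins, and the "models" here are simple string functions rather than a real VLM or LLM.

```python
from dataclasses import dataclass, field


@dataclass
class Robot:
    """Hypothetical robot record: a name and the tasks it has been given."""
    name: str
    tasks: list = field(default_factory=list)


SCENE_PREFIX = "I see "


def describe_scene(detected_objects):
    """Stand-in for a VLM: turn detected object labels into a text description."""
    return SCENE_PREFIX + ", ".join(sorted(detected_objects)) + "."


def propose_tasks(scene_description):
    """Stand-in for an LLM: suggest one pick-up task per object in the description."""
    # Strip the fixed prefix and trailing period produced by describe_scene.
    listed = scene_description[len(SCENE_PREFIX):-1]
    return [f"pick up the {obj}" for obj in listed.split(", ") if obj]


def assign_round_robin(robots, tasks):
    """Spread the proposed tasks across the fleet, one robot at a time."""
    for i, task in enumerate(tasks):
        robots[i % len(robots)].tasks.append(task)
    return robots


fleet = [Robot("robot-a"), Robot("robot-b")]
scene = describe_scene({"sponge", "cup", "apple"})
tasks = propose_tasks(scene)
assign_round_robin(fleet, tasks)

for robot in fleet:
    print(robot.name, robot.tasks)
```

In a real system each stand-in would be replaced by a model call and a robot controller; the point of the sketch is only the order of the steps.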

Across that testing, DeepMind reports 77,000 trials covering more than 6,000 tasks. The team has also developed RT-Trajectory, which uses video input to teach robots. Many teams are using YouTube videos to train robots at scale, but RT-Trajectory adds a two-dimensional sketch of the arm in motion, overlaid on the video.

The team notes that these trajectories, represented as RGB images, give the model practical visual cues as it learns robot-control policies. DeepMind reports that RT-Trajectory training more than doubled the success rate of RT-2 training, achieving 63% success across 41 tasks compared with 29%. The researchers emphasize that RT-Trajectory takes advantage of abundant robot-motion data that currently goes unused.
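To make the idea of a trajectory "represented as an RGB image" concrete, here is a minimal sketch that draws a 2D path onto a small image stored as nested lists of `(r, g, b)` tuples. It is an assumption-laden toy, not the RT-Trajectory pipeline: the helper names `blank_image` and `draw_trajectory`, the image size, and the waypoints are all made up for illustration.

```python
def blank_image(width, height, color=(0, 0, 0)):
    """Create a height x width image filled with one RGB color."""
    return [[color for _ in range(width)] for _ in range(height)]


def draw_trajectory(image, waypoints, color=(255, 0, 0)):
    """Return a copy of the image with straight segments drawn between waypoints.

    Waypoints are (x, y) pixel coordinates; points outside the image are skipped.
    """
    out = [row[:] for row in image]  # copy rows so the input frame is untouched
    for (x0, y0), (x1, y1) in zip(waypoints, waypoints[1:]):
        steps = max(abs(x1 - x0), abs(y1 - y0), 1)
        for s in range(steps + 1):
            # Linear interpolation between the two waypoints.
            x = round(x0 + (x1 - x0) * s / steps)
            y = round(y0 + (y1 - y0) * s / steps)
            if 0 <= y < len(out) and 0 <= x < len(out[0]):
                out[y][x] = color
    return out


frame = blank_image(8, 6)            # hypothetical camera frame
path = [(1, 1), (5, 1), (5, 4)]      # hypothetical gripper path in pixels
overlay = draw_trajectory(frame, path)

for row in overlay:
    print("".join("#" if px != (0, 0, 0) else "." for px in row))
```

The resulting image carries the path as ordinary pixel data, which is the kind of visual cue the article describes a policy model learning from.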

RT-Trajectory is another step toward building robots that can move efficiently and accurately in new scenarios, while also unlocking knowledge from existing datasets.
