DeepMind has published three new research papers, covering task automation, reaction times and motion tracking, that aim to help robots work more efficiently in the real world.
The techniques allow the robots to use powerful large language models, such as the one powering ChatGPT, to learn about and understand complex tasks without having to be trained from scratch for each new function.
Giving robots a brain and a constitution
In a series of videos showing the potential of the research, the robots can be seen picking up soda cans, opening and closing drawers and even cleaning a surface.
Previously, each part of those tasks would have required dedicated code and specific training. The new research lets the bots learn by watching, examining the environment and working out the task themselves in real time.
There is also a new robot constitution that sets out safety rules for both the bots and the underlying large language model decision-making system. Inspired by Isaac Asimov's famous Three Laws of Robotics, the rules essentially boil down to: do no harm to humans.
AutoRT gives vision to robots
The first of the new technologies is AutoRT, a system that directs multiple robots at once, having them perform different tasks across a range of environments simultaneously.
AutoRT uses a visual language model to study the environment and the objects within it, then feeds that data into a large language model. The LLM acts as the decision maker for the bot, telling it which task to perform next and how to perform it.
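The pipeline described above can be sketched roughly as follows. This is a hypothetical illustration of the flow, with stand-in stub functions rather than DeepMind's actual models or APIs; the function names and the constitution format are assumptions for the sketch:

```python
def describe_scene(camera_image):
    """Stand-in for a vision-language model: returns a text
    description of the objects the robot can currently see."""
    return "a soda can and an open drawer on a countertop"

def propose_task(scene_description, constitution):
    """Stand-in for the LLM decision maker: proposes the next task,
    constrained by the robot constitution's safety rules."""
    candidate = f"pick up the soda can seen in: {scene_description}"
    # Reject any candidate task that mentions a forbidden term
    # (a vastly simplified version of constitution-based filtering).
    if any(term in candidate for term in constitution["forbidden"]):
        return "do nothing"
    return candidate

def run_step(camera_image, constitution):
    """One decision cycle: perceive the scene, then pick a task."""
    scene = describe_scene(camera_image)
    return propose_task(scene, constitution)

constitution = {"forbidden": ["human", "sharp"]}
print(run_step(None, constitution))
```

The key design point is the split of responsibilities: perception produces language, and the language model reasons over it, so the same decision-making layer can be reused across robots and environments.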
This system was tested over seven months and used to train 20 robots. In total they performed 6,650 individual tasks during the training period to fine-tune the data collection and processing.
SARA lets the robots make changes themselves
Self-Adaptive Robust Attention for Robotics Transformers (or SARA-RT for short) is a new system that is scalable and lets the robots make improvements to their instructions based on real-world feedback. It is a way of fine-tuning the model to work as efficiently as possible.
The robot can take in spatial data from its cameras and information from its other sensors, then use that to pick the best path to completing any given task.
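The efficiency gain in SARA-RT comes from replacing standard quadratic-cost attention in the robotics transformer with a linearized form. The NumPy sketch below contrasts the two, assuming a simple positive feature map; it illustrates the general linear-attention idea, not DeepMind's specific implementation:

```python
import numpy as np

def softmax_attention(Q, K, V):
    """Standard attention: builds an (n, n) score matrix, so cost
    grows quadratically with sequence length n."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def linear_attention(Q, K, V):
    """Linearized attention: a feature map phi replaces the softmax,
    so (K^T V) is computed once as a small (d, d) matrix and the
    (n, n) score matrix is never materialized."""
    phi = lambda x: np.maximum(x, 0) + 1e-6  # simple positive feature map
    Qf, Kf = phi(Q), phi(K)
    KV = Kf.T @ V                                    # (d, d), independent of n
    norm = Qf @ Kf.sum(axis=0, keepdims=True).T      # (n, 1) normalizer
    return (Qf @ KV) / norm

rng = np.random.default_rng(0)
n, d = 8, 4
Q, K, V = rng.normal(size=(3, n, d))
out = linear_attention(Q, K, V)
```

For a robot processing long streams of camera frames, avoiding the quadratic score matrix is what makes faster on-board decisions feasible.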
RT-Trajectory helps robots generalize
The final technique, RT-Trajectory, adds visual contour data describing the robot's motion to videos in the training set. The robot can use this additional data to generalize more effectively and improve its understanding of the task it has to perform.
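One way to picture the contour data described above: turn a recorded sequence of 2D gripper positions into a polyline that can be overlaid on a training video frame as an extra visual hint. This is a hypothetical sketch of that preprocessing step, not the paper's actual method:

```python
def trajectory_overlay(positions, width, height):
    """Clamp 2D waypoints to the frame bounds, then return the
    trajectory as consecutive line segments for drawing."""
    clamped = [(min(max(x, 0), width - 1), min(max(y, 0), height - 1))
               for x, y in positions]
    # Pair each point with its successor to form drawable segments.
    return list(zip(clamped, clamped[1:]))

# A three-waypoint motion on a 256x256 frame; the last point is
# clamped back inside the frame.
segments = trajectory_overlay([(10, 20), (50, 60), (300, 90)], 256, 256)
```

Giving the policy an explicit sketch of "where the arm should go" is what lets it transfer a motion to tasks it never saw during training.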
This is just an early research project from DeepMind, working on the underlying technology rather than deploying it to live products. So don’t expect an android to make you a coffee anytime soon.
Ryan Morrison is the AI Editor for Tom's Guide.