This week, the Massachusetts Institute of Technology (MIT) showcased a new model for training robots. Rather than relying on the narrow, task-specific datasets typically used to teach robots new skills, the approach draws on vast amounts of data, much as large language models (LLMs) are trained.
Researchers note that imitation learning, in which an agent learns by following an individual performing a task, can fail when small challenges are introduced. These could include different lighting, a different environment, or new obstacles. In those situations, the robot does not have enough data to draw on in order to adapt.
The team drew inspiration from models like GPT-4, adopting a brute-force, data-driven approach to problem-solving.
"In the field of language, data is represented by sentences," said Lirui Wang, the lead author of the paper. "In robotics, given the diversity of data, if you want to pre-train in a similar manner, we need a different architecture."
The team introduced a new architecture called the Heterogeneous Pre-trained Transformer (HPT), which pulls together information from different sensors and different environments. A transformer then combines that data into a single training model. The larger the transformer, the better the output.
Users then input the robot's design and configuration, along with the task they want it to accomplish.
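As a rough illustration of what integrating data from different sensors can involve, the sketch below projects readings of different sizes onto tokens of one shared width, which is the kind of uniform input a transformer expects. It is not the researchers' implementation: the names `make_stem` and `tokenize`, the token width, the sensor names, and the fixed random weights are all assumptions made for this example (in a real system the projection would be learned), and the transformer itself is left out.

```python
import random

TOKEN_DIM = 4  # hypothetical shared token width, chosen for illustration


def make_stem(input_dim, token_dim=TOKEN_DIM, seed=0):
    """Build a hypothetical per-sensor 'stem': a fixed random linear projection
    from input_dim values to one token of token_dim values."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(input_dim)]
               for _ in range(token_dim)]

    def stem(reading):
        if len(reading) != input_dim:
            raise ValueError(f"expected {input_dim} values, got {len(reading)}")
        # One dot product per output dimension.
        return [sum(w * x for w, x in zip(row, reading)) for row in weights]

    return stem


def tokenize(observation, stems):
    """Turn a dict of sensor readings into a list of equal-width tokens."""
    return [stems[name](reading) for name, reading in observation.items()]


# Two made-up robots with different sensors and reading sizes.
arm_stems = {"camera": make_stem(6, seed=1), "joints": make_stem(7, seed=2)}
rover_stems = {"lidar": make_stem(3, seed=3)}

arm_tokens = tokenize({"camera": [0.1] * 6, "joints": [0.5] * 7}, arm_stems)
rover_tokens = tokenize({"lidar": [1.0, 2.0, 3.0]}, rover_stems)

# Every token has the same width, so one shared model could consume both.
print([len(t) for t in arm_tokens + rover_tokens])  # prints [4, 4, 4]
```

Because every sensor ends up in the same token format, data from very different robots could in principle be fed to one shared model, which is the property the article attributes to HPT.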
"Our dream is to have a universal robot brain that you can download and use for your robot without any training," said David Held, an associate professor at Carnegie Mellon University, speaking about the research. "Although we are just beginning, we will continue to strive, hoping that the scaling up will bring breakthroughs to robot strategies, just as it has done with large language models."
The research was funded in part by the Toyota Research Institute (TRI). Last year at TechCrunch Disrupt, TRI demonstrated a method for training robots overnight. More recently, it struck a watershed partnership that will pair its robot learning research with Boston Dynamics' hardware.