People have long dreamed of humanoid robots that match or even surpass human flexibility and agility. Achieving agile, coordinated whole-body motion remains difficult, however, because the dynamics of a simulated environment differ from those of the real world. Traditional remedies fall short: system identification relies on laborious parameter tuning, while domain randomization tends to produce overly conservative motions that sacrifice agility. A new framework called ASAP (Aligning Simulation and Real-World Physics) takes a different approach, aligning simulated and real physics so that humanoid robots can master more agile whole-body skills.


The ASAP framework consists of two stages. In the first, pre-training, researchers retarget human motions from video data onto a humanoid robot and train motion-tracking policies in simulation. Deploying these policies directly on a real robot often degrades performance, because the simulated dynamics differ from the real ones. The second stage, post-training, addresses this problem: researchers run the pre-trained policies on the real robot and record its actual motion trajectories.

ASAP then replays the recorded real-world motion in the simulator. Because the simulator's dynamics differ from the real world's, the simulated trajectory deviates from the recorded one, and this discrepancy serves as a learning signal. ASAP trains a "differential motion model" (called a delta action model in the paper) that learns to compensate for the gap. The model acts as a correction term: it adjusts the actions fed to the simulator so that the simulator behaves more like the real world. Researchers then integrate the model into the simulator and use the corrected simulator to fine-tune the pre-trained motion-tracking policy, so that the robot's movements adapt to real-world physics. The fine-tuned policy is deployed directly on the real robot, and the "differential motion model" is no longer needed at that point.
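The correction idea can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes a one-dimensional velocity system, invents both the "simulator" and the "real" dynamics, and fits a linear correction by least squares rather than with the learning procedure used in the paper. All function names and constants are hypothetical.

```python
import numpy as np

DT = 0.02  # integration step in seconds (assumed)

def sim_step(v, a):
    """Idealized simulator: unit actuator gain, no friction."""
    return v + DT * a

def real_step(v, a):
    """Stand-in for the real robot: weaker actuator plus viscous friction."""
    return v + DT * (0.8 * a - 0.5 * v)

def rollout(step, actions, v0=0.0):
    """Apply a sequence of actions and return the visited velocities."""
    vs = [v0]
    for a in actions:
        vs.append(step(vs[-1], a))
    return np.array(vs)

def fit_delta_action(real_vs, actions):
    """Fit da = w[0] * v + w[1] * a so that sim_step(v, a + da)
    reproduces the recorded real transitions (least squares)."""
    # Correction the simulator would have needed at each recorded step.
    target = (real_vs[1:] - real_vs[:-1]) / DT - actions
    features = np.stack([real_vs[:-1], actions], axis=1)
    weights, *_ = np.linalg.lstsq(features, target, rcond=None)
    return weights

def corrected_step(weights):
    """Simulator step with the learned action correction added."""
    def step(v, a):
        return sim_step(v, a + weights[0] * v + weights[1] * a)
    return step

rng = np.random.default_rng(0)
actions = rng.uniform(-1.0, 1.0, size=200)   # "pre-trained policy" actions
real_vs = rollout(real_step, actions)        # recorded real trajectory

weights = fit_delta_action(real_vs, actions)
err_before = np.abs(rollout(sim_step, actions) - real_vs).mean()
err_after = np.abs(rollout(corrected_step(weights), actions) - real_vs).mean()
print(f"replay error before: {err_before:.4f}, after: {err_after:.2e}")
```

In this toy setting the corrected simulator reproduces the recorded trajectory almost exactly. A policy would then be fine-tuned inside the corrected simulator and deployed without the correction, which this sketch does not show.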

To validate the framework, researchers ran several experiments, including transfer between different simulators and tests on a real Unitree G1 humanoid robot. The results show that ASAP improves the robot's agility and whole-body coordination across a range of dynamic motions. Compared with system identification, domain randomization, and delta dynamics learning baselines, ASAP significantly reduces motion tracking error.
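As a rough illustration of what a motion tracking error measures, the sketch below computes a mean per-joint position error between a reference motion and the motion the robot actually produced. The function name and the (frames, joints, 3) array layout are assumptions made for this example and may not match the exact metrics reported in the paper.

```python
import numpy as np

def mean_per_joint_position_error(reference, actual):
    """Average Euclidean distance between matching joints.

    Both inputs are arrays of shape (frames, joints, 3) holding
    joint positions in the same units and the same frame of reference.
    """
    reference = np.asarray(reference, dtype=float)
    actual = np.asarray(actual, dtype=float)
    if reference.shape != actual.shape:
        raise ValueError("reference and actual must have the same shape")
    # Distance per joint per frame, then average over everything.
    return float(np.linalg.norm(reference - actual, axis=-1).mean())

# Example: every joint is displaced by the vector (3, 4, 0).
reference = np.zeros((2, 3, 3))
actual = reference + np.array([3.0, 4.0, 0.0])
print(mean_per_joint_position_error(reference, actual))
```

A lower value means the executed motion stays closer to the reference motion.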

ASAP succeeds because it bridges the dynamics gap between simulation and the real world, so that humanoid robots trained in simulation retain their agility once deployed on hardware. This points to a new direction for developing more flexible and versatile humanoid robots.

Key technologies of the ASAP framework include:

Pre-training on human motion data: Agile human movements are converted into tracking targets for the robot, providing high-quality motion data for learning.

Training the differential motion model: The model learns the differences between the real world and the simulated environment and compensates for the simulator's shortcomings, improving simulation accuracy.

Policy fine-tuning with the differential motion model: The robot's policy adapts to the physical characteristics of the real world, ultimately achieving better motion performance.

Experimental validation of the ASAP framework shows:

In transfer between simulators, ASAP significantly reduces motion tracking error and outperforms the baseline methods.

In tests on real robots, ASAP also significantly enhances the robot's motion performance, enabling it to perform challenging agile movements.

The study also examines the key factors in training the differential motion model, including dataset size, training duration, and the weight on the action norm. In addition, researchers compared different ways of using the differential motion model and found that fine-tuning with reinforcement learning performs best.
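To see why an action norm weight matters, it can help to write the correction as a regularized fitting problem. The expression below is a generic sketch and is not taken from the paper, which trains the model with its own procedure. The symbols are assumptions: f^sim is the simulator's transition function, (s_t, a_t, s_{t+1}) are recorded real-world transitions, pi^Delta_theta is the differential motion model, and lambda is the action norm weight.

```latex
\min_{\theta} \sum_{t}
  \left\| s_{t+1} - f^{\mathrm{sim}}\!\bigl(s_t,\; a_t + \pi^{\Delta}_{\theta}(s_t, a_t)\bigr) \right\|^2
  + \lambda \left\| \pi^{\Delta}_{\theta}(s_t, a_t) \right\|^2
```

Under this reading, a larger lambda keeps the corrections small and conservative, while a smaller lambda lets the model fit the recorded data more closely at the risk of producing large corrections.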

Despite this progress, ASAP still has limitations, including hardware constraints, reliance on motion capture systems, and high data requirements. Future research directions may include policy architectures that are aware of hardware damage, markerless pose estimation or onboard sensor fusion to reduce reliance on motion capture, and more efficient adaptation techniques for the differential motion model.

ASAP is a promising development for humanoid robotics. By addressing the dynamics gap between simulation and reality, it enables humanoid robots to master more agile and coordinated movement skills, laying a foundation for the wider application of humanoid robots in the real world.

Project page: https://agile.human2humanoid.com/

Paper: https://arxiv.org/pdf/2502.01143