Large language models excel at many tasks, but their reasoning capabilities have long been a subject of debate. Researchers at Meta recently published a paper showing how Transformer models can address a long-standing challenge in mathematics: discovering global Lyapunov functions for dynamical systems.
A Lyapunov function certifies the stability of a dynamical system: informally, it is a scalar function that is positive everywhere except at an equilibrium and decreases along every trajectory of the system, so its existence guarantees that the system settles rather than diverges. A classic motivating case is the three-body problem, the question of whether the trajectories of three celestial bodies under mutual gravity remain bounded in the long run. However, no general method for deriving Lyapunov functions is known, and explicit functions have been found for only a handful of systems.
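As a concrete illustration (our own toy example, not taken from the paper), the two conditions can be checked symbolically for a simple two-dimensional system, assuming sympy is available:

```python
import sympy as sp

# Toy system (our own example): x' = -x + y, y' = -x - y**3.
x, y = sp.symbols("x y", real=True)
f = sp.Matrix([-x + y, -x - y**3])  # right-hand side of the ODE
V = x**2 + y**2                     # candidate Lyapunov function

# dV/dt along trajectories is the gradient of V dotted with f.
Vdot = sp.simplify((sp.Matrix([V]).jacobian([x, y]) * f)[0])
print(Vdot)  # -2*x**2 - 2*y**4, non-positive everywhere

# V > 0 away from the origin and dV/dt <= 0 everywhere,
# so V certifies that the origin is a stable equilibrium.
```

Here the check succeeds because V was chosen by hand; for most systems of interest, no such V is known, which is exactly the gap the paper targets.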
To tackle this, Meta's researchers trained a sequence-to-sequence Transformer model to predict a Lyapunov function for a given system. The key innovation is a "reverse generation" method for creating a large dataset of stable dynamical systems paired with their Lyapunov functions.
The traditional "forward generation" method starts with randomly generated systems and attempts to compute their Lyapunov functions, which is inefficient and only handles specific types of simple systems. In contrast, the "reverse generation" method first randomly generates Lyapunov functions and then constructs corresponding stable systems, bypassing the challenge of computing Lyapunov functions and allowing for more diverse training data.
The researchers found that the Transformer model trained on the "reverse generation" dataset achieved near-perfect accuracy (99%) on a held-out test set and performed well (73%) on out-of-distribution test sets. More surprisingly, adding a small number (300) of simple "forward generation" examples to the training set raised the out-of-distribution accuracy to 84%, indicating that even a handful of known solutions can significantly improve the model's generalization.
To test the model's ability to discover new Lyapunov functions, the researchers generated tens of thousands of random systems and ran the model on them. The model found Lyapunov functions for polynomial systems at roughly ten times the success rate of current state-of-the-art methods, and it also discovered Lyapunov functions for non-polynomial systems, something no existing algorithm can do.
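A model's output is only a candidate until it is checked, so each prediction must be verified before it counts as a success. As a rough illustration of the weakest possible filter (our own hypothetical helper, far weaker than the rigorous verification a real pipeline would require), a numerical falsification test could look like this:

```python
import numpy as np

def plausible_lyapunov(V, grad_V, f, n, trials=10_000, seed=0):
    """Reject a candidate V if random sampling finds V <= 0 or dV/dt > 0
    away from the origin. Passing is evidence, not a proof."""
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        p = rng.uniform(-10.0, 10.0, size=n)
        if np.linalg.norm(p) < 1e-6:
            continue  # conditions are only required away from the equilibrium
        if V(p) <= 0 or grad_V(p) @ f(p) > 1e-9:
            return False
    return True

# Checking the toy system from the first example:
V = lambda p: p[0]**2 + p[1]**2
grad_V = lambda p: np.array([2 * p[0], 2 * p[1]])
f = lambda p: np.array([-p[0] + p[1], -p[0] - p[1]**3])
print(plausible_lyapunov(V, grad_V, f, n=2))  # True
```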
The researchers also compared the model with human mathematicians: 25 master's students in mathematics took a test on the same kind of problems, and the model's accuracy was significantly higher than theirs.
This study demonstrates that Transformer models can be trained to solve hard mathematical reasoning problems, and that "reverse generation" is an effective way to build training datasets where traditional methods fall short. The researchers plan to apply the approach to other mathematical challenges and to explore further possibilities for AI in scientific discovery.