Google's DeepMind research lab has recently introduced AlphaGeometry2, an AI system for solving geometry problems that outperforms the average gold medalist at the International Mathematical Olympiad (IMO). The system is an improved version of AlphaGeometry, and the researchers report that it can solve 84% of the IMO geometry problems from the past 25 years.

Why is DeepMind focusing on high school math competitions? The lab believes that finding new ways to solve hard geometry problems, especially in Euclidean geometry, could be key to more capable AI. Proving a mathematical theorem, or explaining why a theorem such as the Pythagorean theorem holds, requires logical reasoning and the ability to choose among many possible next steps. If DeepMind's theory holds, these problem-solving abilities will be crucial for future general-purpose AI models.

In the summer of 2024, DeepMind showcased a system that combines AlphaGeometry2 with AlphaProof, an AI model for formal mathematical reasoning; together they solved four of the six problems at the 2024 IMO. Beyond geometry, DeepMind suggests the approach could be extended to other areas of mathematics and science, such as complex engineering calculations.

The core components of AlphaGeometry2 are a language model from Google's Gemini series and a "symbolic engine." The symbolic engine applies mathematical rules to derive solutions, and the Gemini model helps it reach a feasible proof. IMO geometry problems are based on figures that often need additional "constructions," such as points, lines, or circles, before they can be solved. The Gemini model in AlphaGeometry2 predicts which constructions might be helpful for a given problem.
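The division of labor described above can be illustrated with a toy sketch. This is not DeepMind's implementation: the fact encoding, the two deduction rules, and the hard-coded `toy_proposer` (standing in for the Gemini model) are all assumptions made for illustration. The rule-based engine forward-chains until nothing new follows; when it gets stuck, the proposer suggests an auxiliary construction and the engine tries again.

```python
from typing import Callable, FrozenSet, List, Optional, Set, Tuple

# A fact is a symmetric relation between two named lines, e.g. a is parallel to b.
Fact = Tuple[str, FrozenSet[str]]
Rule = Callable[[Set[Fact]], Set[Fact]]
Proposer = Callable[[Set[Fact], Fact], Optional[Fact]]


def fact(relation: str, x: str, y: str) -> Fact:
    return (relation, frozenset((x, y)))


def perp_transfers_across_parallel(facts: Set[Fact]) -> Set[Fact]:
    """If x is parallel to y and x is perpendicular to m, then y is perpendicular to m."""
    new: Set[Fact] = set()
    for rel_p, pair_p in facts:
        for rel_q, pair_q in facts:
            if rel_p != "parallel" or rel_q != "perp":
                continue
            shared = pair_p & pair_q
            if len(pair_p) == 2 and len(pair_q) == 2 and len(shared) == 1:
                (other,) = pair_p - shared
                (m,) = pair_q - shared
                new.add(fact("perp", other, m))
    return new


def common_perpendicular_implies_parallel(facts: Set[Fact]) -> Set[Fact]:
    """If x and y are both perpendicular to m, then x is parallel to y."""
    new: Set[Fact] = set()
    perps = [pair for rel, pair in facts if rel == "perp" and len(pair) == 2]
    for p in perps:
        for q in perps:
            shared = p & q
            if p != q and len(shared) == 1:
                (x,) = p - shared
                (y,) = q - shared
                new.add(fact("parallel", x, y))
    return new


RULES: List[Rule] = [perp_transfers_across_parallel,
                     common_perpendicular_implies_parallel]


def deductive_closure(facts: Set[Fact], rules: List[Rule]) -> Set[Fact]:
    """Symbolic step: apply every rule until no new fact appears."""
    closure = set(facts)
    while True:
        derived: Set[Fact] = set()
        for rule in rules:
            derived |= rule(closure)
        if derived <= closure:
            return closure
        closure |= derived


def solve(premises: Set[Fact], goal: Fact, rules: List[Rule],
          propose: Proposer, max_constructions: int = 2) -> bool:
    """Alternate rule-based deduction with proposed auxiliary constructions."""
    facts = set(premises)
    used = 0
    while True:
        facts = deductive_closure(facts, rules)
        if goal in facts:
            return True
        if used >= max_constructions:
            return False
        construction = propose(facts, goal)
        if construction is None or construction in facts:
            return False  # the proposer has nothing new to offer
        facts.add(construction)
        used += 1


def toy_proposer(facts: Set[Fact], goal: Fact) -> Optional[Fact]:
    """Hypothetical stand-in for the language model: draw a line m perpendicular to b."""
    suggestion = fact("perp", "b", "m")
    return None if suggestion in facts else suggestion


premises = {fact("parallel", "a", "b"), fact("parallel", "b", "c")}
goal = fact("parallel", "a", "c")

print(solve(premises, goal, RULES, toy_proposer))          # True
print(solve(premises, goal, RULES, lambda facts, g: None))  # False
```

With only these two rules, the engine cannot conclude that a is parallel to c from the premises alone. Once the auxiliary line m is added, both a and c become perpendicular to m, and the goal follows. The real system works with a far richer rule set and a learned proposer, but the loop has the same general shape.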

Notably, AlphaGeometry2 was trained on more than 300 million synthetic theorems and proofs that DeepMind generated itself. To evaluate the system, the research team selected 45 geometry problems from the past 25 years of the IMO and expanded them into a set of 50 problems. AlphaGeometry2 solved 42 of the 50 (84%), surpassing the average score of gold medalists.

However, AlphaGeometry2 still has limitations: it cannot solve problems involving a variable number of points, nonlinear equations, or inequalities. The research has also fed into the debate over whether AI systems should be built on symbolic manipulation or on neural networks. AlphaGeometry2 takes a hybrid approach, combining neural networks with a rule-based symbolic engine.

The success of AlphaGeometry2 offers new directions for the future development of general AI. Although it is not yet fully self-sufficient, the research by the DeepMind team suggests that more self-sufficient AI models may emerge in the future.

Paper link: https://arxiv.org/pdf/2502.03544

Key Points:

📊 AlphaGeometry2 can solve 84% of the geometric problems from the IMO over the past 25 years, surpassing the average score of gold medal winners.  

🔍 The system combines neural networks and a symbolic engine, using a hybrid approach to tackle complex mathematical problems.  

📈 DeepMind aims to advance research toward more powerful general-purpose AI by solving geometry problems.