Alibaba's "Tongyi Qianwen" team has made headlines again! They've just released the Qwen2Math Demo, a mathematical model that's nothing short of a beast, even outperforming GPT-4.
This model not only handles mathematical problems from text inputs but can also interpret formulas from images and screenshots. Imagine taking a photo of a mathematical equation and getting an answer instantly—it's like having a cheat sheet for math class! (Of course, we don't advocate cheating.)
Qwen2-Math is available in three versions: 72B, 7B, and 1.5B. The 72B version is a mathematical prodigy, scoring 7 points higher than GPT-4 on the MATH dataset, an improvement of 9.6%. It's like scoring 145 in the math section of the college entrance exam while your peer genius scores 132.
What's more impressive is that the 7B version, with less than one-tenth the parameters, outperforms the open-source mathematical model NuminaMath. NuminaMath, which won an award at the inaugural AIMO, was honored by none other than the "top guru" of mathematics, Terence Tao.
Lin Junyang, a senior algorithm expert at Alibaba, excitedly announced that they've turned the Qwen2 model into a math whiz. How? They used a special "math brain booster"—a meticulously designed mathematical corpus. This "booster" includes a vast array of high-quality mathematical web texts, books, codes, exam questions, and even math problems "created" by the Qwen2 model itself.
The result? On classic mathematical test sets like GSM8K and MATH, Qwen2-Math-72B leaves the 405B Llama-3.1 in the dust. These test sets are no joke, featuring various tough math problems from algebra to number theory.
Furthermore, Qwen2-Math has challenged the Chinese dataset CMATH and the college entrance exam questions. On the Chinese dataset, even the 1.5B version outperforms the 70B Llama3.1. And regardless of the version, all show significant improvements over the same-scale Qwen2 base model.
It seems "Tongyi Qianwen" has truly unearthed a mathematical genius! Can we now rely on it for math problems? Remember, though, it's just a tool; don't let its intelligence blind you to the importance of building your own mathematical skills!
Online experience link: https://huggingface.co/spaces/Qwen/Qwen2-Math-Demo