At its year-end event, Beijing Zhipu Huazhang Technology Co., Ltd. (Zhipu AI) released the first preview of its reasoning model trained with extended reinforcement learning, GLM-Zero-Preview. The model focuses on strengthening AI reasoning, excelling in particular at mathematical logic, code writing, and complex problems that require deep reasoning. Compared with the base model, GLM-Zero-Preview substantially improves performance on expert tasks while maintaining general-task performance, achieving results comparable to OpenAI's o1-preview on the AIME 2024, MATH500, and LiveCodeBench benchmarks.

Users can now try GLM-Zero-Preview for free via the "Zero Reasoning Model" agent on the Zhipu Qingyan platform, which supports text and image uploads and displays the model's complete reasoning process. Developers can also access the model through the Zhipu Open Platform API.
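As a minimal sketch of what a developer call through the Open Platform might look like, the snippet below builds a chat-completion request for the model. The endpoint URL, payload fields, and helper function are assumptions based on common chat-API conventions, not an official example; consult the Zhipu Open Platform documentation linked at the end of this article for the authoritative schema.

```python
import json

# Assumed chat-completions endpoint for the Zhipu Open Platform.
API_URL = "https://open.bigmodel.cn/api/paas/v4/chat/completions"

def build_request(prompt: str, api_key: str) -> tuple[dict, dict]:
    """Return (headers, payload) for a GLM-Zero-Preview request.

    Sending the HTTP request is left to the caller (e.g. via `requests.post`);
    a valid API key from the Zhipu Open Platform is required to actually call it.
    """
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": "glm-zero-preview",
        "messages": [{"role": "user", "content": prompt}],
    }
    return headers, payload

headers, payload = build_request("Prove that sqrt(2) is irrational.", "YOUR_API_KEY")
print(json.dumps(payload, indent=2))
```

Because the model emits its full reasoning trace before the final answer, responses are typically much longer than those of a standard chat model, which is worth accounting for when setting token limits.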


Although GLM-Zero-Preview still trails OpenAI's o3 model, Zhipu AI plans to keep optimizing and iterating on its reinforcement learning techniques and will soon launch the official GLM-Zero, extending deep-thinking capabilities from mathematical logic to broader general-purpose domains.

In terms of performance, GLM-Zero-Preview demonstrates the importance of reinforcement learning for deep reasoning: as training compute increases, the model's deep-reasoning performance improves steadily. A scaling law at inference time has also been validated, meaning that as the model is allowed to generate more tokens and use more compute, the quality of its results improves steadily. During reasoning, GLM-Zero-Preview can make autonomous decisions, decompose problems, and try multiple approaches to a solution, resembling the human problem-solving process.

In practical case studies, GLM-Zero-Preview can identify logical flaws and work through multiple hypotheses in logical reasoning. In mathematics, the model shows strong inductive and deductive ability, handling complex operations quickly and achieving an excellent graduate-level score on the mathematics section of the 2025 Chinese postgraduate entrance exam. In programming, it is proficient in multiple languages and helps developers write code quickly.

Zhipu Qingyan:

https://chatglm.cn/main/gdetail/676411c38945bbc58a905d31?lang=en

Zhipu Open Platform:

https://bigmodel.cn/dev/api/normal-model/glm-zero-preview