Zhipu AI (hereinafter "Zhipu"), one of China's most closely watched AI companies, recently announced the open-sourcing of its new generation of GLM large language models.
The release is notably broad: it spans two parameter scales, 32B and 9B, and within each includes base models, inference models, and rumination models that represent a future exploration direction. All of the open-sourced models are released under the permissive MIT license, granting developers significant freedom, including for commercial applications.
Simultaneously, this model series is available for free trial via Zhipu's new platform, Z.ai, and is also available on the Zhipu MaaS platform (bigmodel.cn).
Open-Sourcing Empowerment: Technological Inclusiveness and Accelerated Innovation
The most striking aspect of Zhipu's open-sourced GLM model series is its openness. All models are released under the MIT license, allowing free commercial use and redistribution. Developers can therefore adopt advanced large language model technology without worrying about licensing issues, which significantly lowers the barrier to entry for AI applications and could accelerate the intelligent transformation of various industries.
Zhipu has open-sourced models in two sizes: 9B and 32B parameters, each including base, inference, and rumination models. Models of different sizes cater to developers' needs in various resource and application scenarios, offering greater flexibility.
Performance Leap: Small Parameters, Big Power
One of the core highlights of this release is the outstanding performance of the 32B-parameter inference model, GLM-Z1-32B-0414. According to official data, its performance on some tasks is comparable to that of top-tier models such as DeepSeek-R1, which has 671B parameters. Even more impressive is its measured inference speed of 200 tokens/second (on the MaaS platform bigmodel.cn), making it arguably the fastest commercial model in China. Furthermore, its price is only 1/30th that of DeepSeek-R1, demonstrating exceptional value for money.
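To put the reported 200 tokens/second in perspective, a quick back-of-envelope calculation helps (the response lengths below are illustrative assumptions, not figures from the announcement):

```python
# Back-of-envelope: how long a response streams at a given decode speed.
def generation_time_s(num_tokens: int, tokens_per_second: float) -> float:
    """Seconds needed to generate num_tokens at a steady decode rate."""
    return num_tokens / tokens_per_second

# At the reported 200 tokens/s, a 1,000-token answer streams in about 5 s,
# and a long 4,000-token reasoning trace in about 20 s.
print(generation_time_s(1000, 200))  # 5.0
print(generation_time_s(4000, 200))  # 20.0
```

At typical reading speed this is well beyond real time, which matters most for inference models, whose chain-of-thought output can run to thousands of tokens per query.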
Regarding the base model, GLM-4-32B-0414 has 32 billion parameters, and its performance is comparable to that of mainstream models with larger parameter counts from China and abroad. The model was pre-trained on 15T tokens of high-quality data, including a wealth of reasoning-oriented synthetic data, laying a solid foundation for subsequent reinforcement learning. In the post-training phase, it further leverages techniques such as human preference alignment, rejection sampling, and reinforcement learning to significantly improve its key capabilities in instruction following, engineering code generation, and function calling for agent tasks.
In practical applications, GLM-4-32B-0414 excels in engineering code, artifact generation, function calling, search Q&A, and report writing, with some benchmark indicators even approaching or exceeding those of larger models such as GPT-4o and DeepSeek-V3-0324 (671B). It's worth noting that the Z.ai platform's conversational mode includes a preview function that supports visualizing generated HTML and SVG, facilitating user evaluation and iterative optimization.
The inference model, GLM-Z1-32B-0414, builds upon GLM-4-32B-0414, employing a cold-start and expanded reinforcement learning strategy and undergoing deep optimization training for key tasks such as mathematics, code, and logic.
Consequently, its mathematical capabilities and complex problem-solving abilities have been significantly enhanced. Evaluations in benchmarks such as AIME24/25, LiveCodeBench, and GPQA demonstrate GLM-Z1-32B-0414's strong mathematical reasoning capabilities, enabling it to handle a wider range of complex tasks.
Notably, Zhipu also launched a 9B-parameter inference model, GLM-Z1-9B-0414. Despite its smaller parameter count, thanks to the same technology and training methods, this model still performs exceptionally well in mathematical reasoning and general tasks, placing it among the leading open-source models of its size. This makes it a strong option for users who need lightweight deployment in resource-constrained scenarios.
Cutting-Edge Exploration: Rumination Models Leading the Future
Another highlight of this release is the rumination model, GLM-Z1-Rumination-32B-0414. Zhipu positions it as the next step in exploring the future form of AGI. Unlike typical inference models, rumination models solve highly open and complex problems through deeper, multi-step thinking.
Its key innovation lies in its ability to integrate search tools to handle complex tasks during deep thinking and utilize various rule-based reward mechanisms to guide and expand end-to-end reinforcement learning training. This model supports a complete research loop of "autonomously raising questions—searching for information—constructing analyses—completing tasks," significantly improving its capabilities in research writing and complex retrieval tasks. Users can now experience its powerful in-depth research capabilities through the Z.ai platform.
New Platform and API Services: Convenient and Accessible
To facilitate user experience and utilization of these new models, Zhipu has launched the new domain Z.ai. This platform integrates 32B base, inference, and rumination GLM models, serving as the interactive experience gateway for Zhipu's latest models.
In addition to the free experience platform, the Zhipu MaaS open platform (bigmodel.cn) also simultaneously launched API services for base and inference models, providing support for enterprises and developers. The launched base models offer GLM-4-Air-250414 (free) and GLM-4-Flash-250414. Inference models offer GLM-Z1-AirX (high-speed version, 200 tokens/second), GLM-Z1-Air (cost-effective version, priced at only 1/30th of DeepSeek-R1), and GLM-Z1-Flash (free version) to meet the needs of different scenarios.
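The API services above follow the chat-completion pattern common to LLM platforms. As a minimal sketch of what a request might look like, the snippet below assembles an OpenAI-style payload; the model name comes from the article, but the endpoint path, authentication header, and exact request schema shown in the comments are assumptions, so the official bigmodel.cn documentation should be consulted before use:

```python
import json

def build_chat_request(model: str, user_message: str) -> dict:
    """Assemble an OpenAI-style chat-completion payload of the kind
    the Zhipu MaaS platform is generally reported to accept."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }

# GLM-Z1-Flash is the free inference-model tier named in the article;
# the identifier casing expected by the API may differ.
payload = build_chat_request("GLM-Z1-Flash", "Prove that sqrt(2) is irrational.")
print(json.dumps(payload, ensure_ascii=False, indent=2))

# Sending it would look roughly like (requires an API key from bigmodel.cn):
#   POST https://open.bigmodel.cn/api/paas/v4/chat/completions
#   Authorization: Bearer <API_KEY>
```

Because the free Flash tier and the high-speed AirX tier share the same request shape, switching tiers should only require changing the model identifier.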