Zhipu AI's technology team has open-sourced its 32B and 9B series of GLM (General Language Model) models and launched a new interactive platform, Z.ai. The series includes base, reasoning, and rumination models, all released under the permissive MIT license, which allows free commercial use, modification, and redistribution.

The open-sourced 32B base model, GLM-4-32B-0414, has 32 billion parameters and was pre-trained on 15 terabytes of high-quality data, including a large amount of reasoning-oriented synthetic data. Post-training with techniques such as rejection sampling and reinforcement learning improved its performance on instruction following, code generation, and function calling. On some benchmarks it approaches or even surpasses larger models such as GPT-4o and DeepSeek-V3-0324 (671B). GLM-4-32B-0414 also shows stronger code generation and can produce more complex single-file programs. Z.ai's conversational mode includes a preview function that renders generated HTML and SVG, making it easier to evaluate the output and refine it iteratively.

The reasoning model, GLM-Z1-32B-0414, builds on GLM-4-32B-0414. It was trained with a cold-start phase followed by extended reinforcement learning, with further targeted training on mathematics, code, and logic. On some tasks its performance rivals that of the 671B-parameter DeepSeek-R1, showing strong mathematical reasoning and the ability to handle more complex problems. According to the announcement, GLM-Z1-32B-0414 reaches an inference speed of 200 tokens/second, the fastest among commercially available models in China, at 1/30th the price of DeepSeek-R1.

The 9B model, GLM-Z1-9B-0414, applies the same techniques. Despite its smaller parameter count, it performs strongly on mathematical reasoning and general tasks and ranks at the top among open-source models of similar size. Its efficiency makes it well suited to resource-constrained environments and lightweight deployments.
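
To make the "resource-constrained" point concrete, here is a back-of-the-envelope sketch of how much memory the model weights alone would occupy at 9B versus 32B parameters. The precision formats (bf16, int8, int4) are assumptions for illustration and are not taken from the announcement; the estimate ignores activations, the KV cache, and runtime overhead, so real usage would be higher.

```python
# Rough weight-memory estimate: parameter count x bytes per parameter.
# The precisions below are illustrative assumptions, not formats named by Zhipu.
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}

# Nominal parameter counts implied by the model names.
MODELS = {"GLM-Z1-9B-0414": 9e9, "GLM-Z1-32B-0414": 32e9}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Approximate size of the weights in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def memory_table() -> dict:
    """Weight size per model and precision, in GB."""
    return {
        name: {dtype: weight_memory_gb(n, dtype) for dtype in BYTES_PER_PARAM}
        for name, n in MODELS.items()
    }


if __name__ == "__main__":
    for name, row in memory_table().items():
        print(name, {dtype: f"{gb:.1f} GB" for dtype, gb in row.items()})
```

Under these assumptions the 9B weights come to roughly 18 GB in bf16 and 4.5 GB in int4, against 64 GB and 16 GB for the 32B model, which is the gap that matters on a single consumer GPU.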

The rumination model, GLM-Z1-Rumination-32B-0414, represents Zhipu's exploration toward AGI (Artificial General Intelligence). Unlike typical reasoning models, it tackles highly open-ended and complex problems through longer, deeper deliberation. Its key innovation is the ability to call search tools during deep thinking. In training, multiple rule-based reward mechanisms guide and extend end-to-end reinforcement learning. The model supports a complete research cycle (pose questions, search for information, build an analysis, complete the task), which significantly improves its performance on research-style writing and complex retrieval tasks.
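
The four-stage research cycle can be pictured as a simple control loop. The sketch below is purely illustrative and is not Zhipu's implementation: in the actual model these steps happen inside the model's own deliberation and tool calls, whereas here `run_research_cycle`, `ResearchState`, `stub_pose_questions`, and `stub_search` are all hypothetical names, and the "model" and "search tool" are replaced by stubs.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ResearchState:
    """Everything accumulated during one research run (hypothetical structure)."""
    task: str
    questions: List[str] = field(default_factory=list)
    evidence: List[str] = field(default_factory=list)
    report: str = ""


def run_research_cycle(
    task: str,
    pose_questions: Callable[[str, List[str]], List[str]],
    search: Callable[[str], List[str]],
    max_rounds: int = 2,
) -> ResearchState:
    state = ResearchState(task=task)
    for _ in range(max_rounds):
        # 1. Pose questions, given the evidence gathered so far.
        new_questions = pose_questions(task, state.evidence)
        if not new_questions:
            break  # nothing left to ask, so stop searching
        state.questions.extend(new_questions)
        # 2. Search for information on each open question.
        for question in new_questions:
            state.evidence.extend(search(question))
    # 3. Build an analysis and 4. complete the task (here: a plain bullet list).
    lines = [f"Task: {task}"] + [f"- {item}" for item in state.evidence]
    state.report = "\n".join(lines)
    return state


def stub_pose_questions(task: str, evidence: List[str]) -> List[str]:
    """Stand-in for the model: ask one question, then stop once evidence exists."""
    return [] if evidence else [f"What is known about {task}?"]


def stub_search(question: str) -> List[str]:
    """Stand-in for a search tool: returns a canned result."""
    return [f"stub result for: {question}"]


if __name__ == "__main__":
    result = run_research_cycle("open-source GLM models", stub_pose_questions, stub_search)
    print(result.report)
```

Passing the question generator and search function in as callables keeps the loop independent of any particular model or search backend, so either stub could be swapped for a real component.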

In addition to the open-source release, the base and reasoning models are available through Zhipu's MaaS (Model-as-a-Service) platform (bigmodel.cn), which provides API services to businesses and developers. The base model is offered in two versions, GLM-4-Air-250414 and GLM-4-Flash-250414, the latter completely free. The reasoning model is offered in three versions for different needs: GLM-Z1-AirX (extremely fast) is positioned as China's fastest reasoning model, reaching 200 tokens/second, eight times faster than conventional models; GLM-Z1-Air (high cost-performance) is priced at 1/30th of DeepSeek-R1 and suits high-frequency calls; and GLM-Z1-Flash is free to use, lowering the barrier to entry.
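
To give a feel for what these figures mean in practice, the short sketch below works through the arithmetic. It reads "eight times faster" as eight times the throughput, which implies a conventional baseline of about 25 tokens/second; that reading, the example response length, and the placeholder reference price are assumptions for illustration, not published numbers.

```python
AIRX_TOKENS_PER_SECOND = 200.0  # GLM-Z1-AirX figure quoted in the article
SPEEDUP = 8.0                   # "eight times faster", read as 8x throughput (assumption)
BASELINE_TOKENS_PER_SECOND = AIRX_TOKENS_PER_SECOND / SPEEDUP  # implied baseline: 25 tokens/s

COST_RATIO = 1.0 / 30.0         # GLM-Z1-Air price relative to DeepSeek-R1, per the article


def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to stream num_tokens at a steady decoding rate."""
    return num_tokens / tokens_per_second


def relative_cost(reference_cost: float) -> float:
    """Cost of the same workload at 1/30th of a reference price."""
    return reference_cost * COST_RATIO


if __name__ == "__main__":
    tokens = 4000  # hypothetical length of a long reasoning trace
    fast = generation_seconds(tokens, AIRX_TOKENS_PER_SECOND)
    slow = generation_seconds(tokens, BASELINE_TOKENS_PER_SECOND)
    print(f"{tokens} tokens: {fast:.0f} s at 200 tok/s vs {slow:.0f} s at 25 tok/s")
    # 30.0 is a placeholder spend in arbitrary currency units, not a real price.
    print(f"Workload costing 30.0 units at the reference price: {relative_cost(30.0):.2f} units")
```

For a 4,000-token response this works out to 20 seconds instead of 160, a difference that matters because reasoning models tend to emit long chains of thought before the final answer.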

Zhipu has also launched a new domain, Z.ai, which brings together the 32B base, reasoning, and rumination GLM models. The site serves as the interactive portal for Zhipu's latest models and currently offers these three open-source models for users to try.