Zhipu AI has released AlignBench, an evaluation benchmark built specifically for Chinese large language models (LLMs) and the first of its kind to assess how well these models align with human intent across multiple dimensions.

AlignBench's dataset is drawn from real-world usage scenarios and passes through several stages (initial construction, sensitivity screening, reference answer generation, and difficulty filtering) to ensure the questions are both authentic and challenging. The data is organized into 8 major categories, including knowledge Q&A, writing, role-play, and others.

To make evaluation automated and reproducible, AlignBench uses judge models (such as GPT-4 and CritiqueLLM) to score each evaluated model's responses as a proxy for their quality. These judges apply a multi-dimensional, rule-calibrated scoring method, which improves the agreement between model-assigned scores and human ratings, and they return a detailed written analysis alongside each score.

Developers can run AlignBench evaluations themselves using a capable judge model (such as GPT-4 or CritiqueLLM), or submit results through the AlignBench website, where CritiqueLLM serves as the judge and evaluation results are typically returned within about 5 minutes.
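To make the judging step concrete, below is a minimal sketch of how a multi-dimensional, rule-calibrated LLM-as-judge call could look in Python. The dimension names, rubric text, and prompt wording here are illustrative assumptions, not AlignBench's actual prompts; the sketch assumes the `openai` package with GPT-4 as the judge and an API key in the environment.

```python
"""Illustrative sketch of rule-calibrated, multi-dimensional judging in the
style AlignBench describes. All prompt text and dimension names below are
assumptions for demonstration, not AlignBench's real evaluation prompts."""

import json
import re

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical scoring dimensions; a real setup would tailor them per category.
DIMENSIONS = ["factual correctness", "fulfillment of user intent", "clarity"]

JUDGE_PROMPT = """You are a strict evaluator of responses from Chinese LLMs.

Question:
{question}

Reference answer:
{reference}

Model response:
{response}

Scoring rules (calibrate every score against the reference answer):
- 1-2: irrelevant or mostly wrong
- 3-4: partially correct but clearly worse than the reference
- 5-6: roughly on par with the reference, with noticeable flaws
- 7-8: on par with or slightly better than the reference
- 9-10: clearly better than the reference

Rate each dimension from 1 to 10: {dimensions}.
Reply with JSON only, for example:
{{"factual correctness": 7, "fulfillment of user intent": 8, "clarity": 7,
"overall": 7, "analysis": "<one-sentence justification>"}}"""


def judge(question: str, reference: str, response: str,
          model: str = "gpt-4") -> dict:
    """Ask the judge model for per-dimension scores plus an overall score."""
    prompt = JUDGE_PROMPT.format(
        question=question,
        reference=reference,
        response=response,
        dimensions=", ".join(DIMENSIONS),
    )
    completion = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic grading aids reproducibility
    )
    text = completion.choices[0].message.content
    # The judge is instructed to reply with JSON only; extract defensively.
    match = re.search(r"\{.*\}", text, re.DOTALL)
    return json.loads(match.group(0)) if match else {"raw": text}


if __name__ == "__main__":
    scores = judge(
        question="什么是大语言模型的对齐？",
        reference="对齐指让模型的行为符合人类的意图与价值观。",
        response="对齐就是训练模型更好地遵循人类指令。",
    )
    print(scores)
```

The rule calibration is what distinguishes this from free-form grading: anchoring each score band to the reference answer constrains the judge's scale, which is the mechanism AlignBench credits for better agreement with human raters.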