9 AI Models Challenge the Toughest Henan Exam of the 2024 Gaokao, Doubao Takes the Domestic Lead

AIbase

Published in AI News · 5 min read · Jun 26, 2024

During the 2024 college entrance examination season, nine large AI models sat the national college entrance examination, specifically the highly challenging New Curriculum Standard Volume I: Henan Paper. The test, initiated by the media, assessed the models' academic capabilities and offered a view of how AI and human intelligence differ.

Of the nine AI models tested, four exceeded the first-tier undergraduate line of the Henan college entrance examination. GPT-4o took the top spot with 562 points, 41 points above the first-tier line. ByteDance's Doubao followed closely with 542.5 points, the best result among the domestic models.
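The arithmetic behind these figures can be checked with a short sketch. The article gives GPT-4o's score and its margin but does not state the first-tier line itself, so the line below is derived from those two numbers. The helper name `margin_over_line` is made up for illustration.

```python
# Figures reported in the article.
GPT_4O_SCORE = 562.0
GPT_4O_MARGIN = 41.0   # points above the first-tier line
DOUBAO_SCORE = 542.5

# Assumption: the first-tier line is not stated in the article;
# it is derived here as GPT-4o's score minus its reported margin.
implied_first_tier_line = GPT_4O_SCORE - GPT_4O_MARGIN


def margin_over_line(score: float, line: float) -> float:
    """Return points above (positive) or below (negative) the line."""
    return score - line


doubao_margin = margin_over_line(DOUBAO_SCORE, implied_first_tier_line)

print(implied_first_tier_line)  # 521.0
print(doubao_margin)            # 21.5
```

Under that derivation, Doubao sits 21.5 points above the implied line.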

Image: Robots taking the college entrance exam

Image source note: The image was generated by AI; the authorized service provider is Midjourney.

The models performed exceptionally well in liberal arts subjects, especially Chinese and English, but less impressively in science subjects, particularly mathematics. They showed a clear advantage in language subjects, including strong comprehension of ancient poetry.

The models handled simple reasoning questions adequately but did poorly on questions requiring complex derivations and proofs, which points to a need for better logical reasoning. In the comprehensive liberal arts section, scores were lowest in geography, while in the comprehensive science section, scores were relatively better in biology. GPT-4o stood out with 91.5 points in politics.

Testing Method and Scoring Criteria

Test Rounds: To reduce the impact of randomness, all subjects were tested twice, with the average score serving as the final result.

Input Format: Formulas were entered in Markdown/LaTeX format. Questions containing images were entered as the corresponding images and text, depending on each model's image-recognition capabilities.

Test Operation: Professional AI data service providers ran the tests in a standardized way and captured screenshots to ensure the fairness of the test.

Scoring Method: The same scoring standards as human candidates were used to ensure the fairness of the scoring.
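The two-round averaging described under Test Rounds can be sketched as follows. This is a minimal illustration, not the testers' actual tooling: the function name `final_score` is hypothetical, and the round scores in the usage lines are placeholder values, since the article reports only final results.

```python
from statistics import mean


def final_score(round_scores: list[float]) -> float:
    """Average the scores from the two test rounds of one subject."""
    if len(round_scores) != 2:
        # The article says every subject was tested exactly twice.
        raise ValueError("expected exactly two round scores")
    return mean(round_scores)


# Placeholder values for illustration only; not taken from the article.
example_rounds = [100.0, 110.0]
print(final_score(example_rounds))  # 105.0
```

Averaging over two runs reduces, but does not remove, the effect of run-to-run randomness in model output.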

This attempt to have AI sit the college entrance examination showcased AI's advantages in specific fields and also exposed its shortcomings in logical reasoning and mathematical proofs. As one AI candidate quoted in an essay: "The journey is long and arduous, and I will seek knowledge both above and below." The line reflects the development of AI and also vividly describes humanity's continuous exploration of the unknown. The test gives a deeper understanding of AI's current intellectual level and offers useful insights for its future development.

The candidate list included well-known AI products such as OpenAI's GPT-4o, ByteDance's Doubao, and Baidu's Wenxin 4.0. Their performance in this college entrance examination is expected to influence the future development of AI technology.

This article is from AIbase Daily
