GPT-4 Poses as a University Student in Exams, Outscoring Human Students in 83.4% of Cases

AIbase

Published in AI News · 4 min read · Jun 28, 2024


AI has quietly entered the university exam hall. A study from the University of Reading in the UK found that, in real university exams, 94% of AI-generated answers went undetected by the markers. More striking still, these AI "classmates" outperformed human students in 83.4% of cases. The result suggests that AI is not only a potential replacement for human labor but is also beginning to surpass university students on cognitive tasks.

The study was not run in a closed laboratory but in a real exam setting. The research team conducted a "Turing test" in the University of Reading's School of Psychology and Clinical Language Sciences without informing the graders. The exams comprised short-answer and essay questions, and AI-generated submissions were mixed in, making up about 5% of the total. The researchers generated the answers with GPT-4 using standardized prompts and did not modify the output, so that the submissions reflected what the AI produced on its own.

The grading process followed the rigorous standards of the University of Reading, including preliminary grading, independent review, and calibration meetings of the grading team. However, even under such scrutiny, AI-submitted assignments remained difficult to detect. The results showed that AI-generated assignments went undetected in multiple modules and often scored in the high range.

This finding has prompted serious reflection on academic integrity and the aims of education. If students can use AI to produce high-quality work that is hard to detect, how should the education system be reformed to adapt to the technology? Last year, a paper in the journal Nature also pointed out that, when completing university coursework, AI demonstrated abilities in information search, integration, and critical analysis, which align with the goals of university education.

The study's conclusions are concerning. GPT-4's capabilities make it hard to detect students who cheat with AI, and those students are likely to earn better grades. This challenges academic integrity and also raises questions about the future direction of education. Some online commenters joked that the study itself might have been written by AI; the authors state that the research was conducted entirely by humans.



This article is from AIbase Daily
