AI has quietly "invaded" university exam halls. Research from the University of Reading in the UK found that, in real university exams, 94% of AI-generated answers went undetected by the markers. Even more striking, these AI "classmates" significantly outperformed human students in 83.4% of cases. The findings suggest that AI not only has the potential to replace human workers but is also beginning to surpass university students on cognitive tasks.

The study took place not in a closed laboratory but in a real exam environment. Without informing the graders, the research team ran a "Turing test" at the University of Reading's School of Psychology and Clinical Language Sciences. The exams consisted of short-answer and essay questions, and AI-generated content was mixed in, accounting for about 5% of the total. The researchers generated the answers with GPT-4 using standardized prompts and submitted them without modification, so that the submissions reflected the AI's own output.


Grading followed the University of Reading's rigorous standard process, including preliminary grading, independent review, and calibration meetings of the grading team. Even under this scrutiny, the AI-generated submissions remained difficult to detect. They went undetected across multiple modules and often scored in the high range.

The finding has prompted serious reflection on academic integrity and the goals of education. If students can use AI to generate high-quality content that is hard to detect, how should the education system be reformed to adapt to this emerging technology? Last year, a paper in the journal Nature also pointed out that, when completing university coursework, AI showed capabilities in information search, integration, and critical analysis, which are the very skills university education aims to develop.

The study's conclusions are undoubtedly concerning. GPT-4 is capable enough that students who cheat with AI are difficult to detect and are highly likely to achieve better grades. This not only challenges academic integrity but also raises questions about the future direction of education. Although some internet users jokingly asked whether the study itself had been completed by AI, the authors solemnly declared that the research was conducted entirely by humans.