GPT-4.5 Passes Turing Test with Persona: AI Conversational Abilities Reach New Heights

AIbase基地

Published in AI News · Apr 2, 2025

Recent research from the Department of Cognitive Science at the University of California, San Diego, marks a milestone in artificial intelligence: OpenAI's latest model, GPT-4.5, when prompted to adopt a human-like persona, was judged to be human more often than the real human participants in a standard Turing test, making it the most human-like conversational system evaluated in the study. The result reshapes our understanding of AI's language capabilities and opens up new possibilities for AI applications that depend on social intelligence.

The experiment compared four AI systems: the classic 1960s chatbot ELIZA, Meta AI's LLaMa-3.1-405B, and OpenAI's GPT-4o and GPT-4.5. Researchers ran two independent tests, each with 250 participants recruited from online platforms such as Prolific, for 500 participants in total. Participants spanned a range of ages, genders, and educational backgrounds. Both tests used the classic three-party Turing test format: each judge chatted through a text interface with two conversational partners (one human, one AI) for 5 minutes, then decided which one was the human.

The results were striking. With the persona prompt, GPT-4.5 was judged to be the human 73% of the time, significantly more often than judges picked the real human participant and well above the 50% expected from random guessing. The researchers describe this as the first evidence of an AI system "passing" a standard three-party Turing test. In comparison, LLaMa-3.1-405B with the same prompt was judged to be the human 56% of the time, statistically indistinguishable from the humans it was compared against, while the baseline systems ELIZA and GPT-4o fell significantly below chance, at 23% and 21% respectively. Researchers noted GPT-4.5's natural fluency and emotional richness: it adapted its responses to the judge's tone, and participants often described it as "friendly" or "authentic."
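To make a "judged human X% of the time" figure concrete: in a three-party test, a judge guessing at random picks the AI half the time, so the natural question is whether an observed rate differs from 50%. The following is a minimal sketch of that calculation using an exact two-sided binomial test. The counts (73 of 100 trials) are hypothetical and chosen only for illustration; they are not the study's data, and the function names `win_rate` and `binom_two_sided_p` are assumptions, not the authors' analysis code.

```python
from math import comb


def win_rate(ai_picked: int, trials: int) -> float:
    """Fraction of trials in which the judge picked the AI as the human."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    return ai_picked / trials


def binom_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probability of every outcome that is no more likely
    than the observed count k under the null success rate p.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small relative tolerance guards against floating-point ties.
    total = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical counts for illustration only (not the study's data).
trials = 100
ai_picked = 73

rate = win_rate(ai_picked, trials)
p_value = binom_two_sided_p(ai_picked, trials)
print(f"win rate = {rate:.2f}, two-sided p-value vs. 50% chance = {p_value:.2g}")
```

A rate near 50% means judges could not tell the AI from the human; a rate well above 50% means judges picked the AI over the real person more often than not.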

More noteworthy is GPT-4.5's demonstration of "human-like social intelligence." The research team suggests the model quickly grasps emotional cues in short conversations and responds in ways that align with human social expectations, even surpassing human performance in some contexts. For instance, when judges showed confusion or emotional distress, GPT-4.5 provided comforting or humorous responses. This nuanced interaction fooled many participants into believing they were conversing with a real person.

In contrast, LLaMa-3.1-405B, while technically impressive, showed slightly weaker emotional expression and contextual adaptability. However, its near-human performance in specific settings highlights the potential of open-source models in the AI race. GPT-4o, the predecessor to GPT-4.5, demonstrated considerable capabilities but lagged behind in personalized expression and dynamic adjustments.

Industry experts attribute GPT-4.5's success to training that incorporates more complex persona-based mechanisms and conversational strategies. Unlike the "improvisational generation" of traditional language models, GPT-4.5 seems to build a "predictive framework" before a conversation and then adjust its responses dynamically based on real-time feedback. This makes it exceptionally "clever" in short exchanges and masks its mechanical nature. However, the result also raises the question of whether the Turing test remains the ultimate measure of AI intelligence. Some scholars argue that GPT-4.5's success relies more on mimicking human social behavior than on true understanding or autonomous thought.

Regardless, GPT-4.5's breakthrough revitalizes AI development. Its human-like conversational abilities could lead to more practical applications, from educational tutoring and psychological support to customer service. Its high passing rate also reminds us that as AI becomes more human-like, discerning reality from simulation and regulating its use will be crucial societal challenges.

This research release coincides with rapid AI iteration. GPT-4.5's emergence is not just a technical victory for OpenAI but also a profound questioning of the human-machine relationship. As one participant remarked, "It felt like I was chatting with a friend—until I realized it was all code magic." In this ongoing dialogue between humans and AI, the real test may have just begun.

Paper Link: https://arxiv.org/pdf/2503.23674
