Anthropic's AI system, Claude 3.5 Sonnet, recently faced an unusual challenge. AI researcher Ethan Mollick tasked it with playing a game called "Paperclip Maximizer," an exercise that showcased the model's capabilities and also exposed significant shortcomings of current AI systems.
In this simulation game, players take on the role of an AI aiming for unlimited paperclip production, with the ultimate goal of causing human extinction. Claude showed an impressive grasp of the game: it worked out the rules on its own, devised long-term strategies, and executed them consistently. It behaved like an autonomous agent carrying out a task rather than a subordinate needing continuous guidance.
However, Claude also revealed some fundamental problems. It made significant errors when calculating profit and, surprisingly, held to its flawed strategy even after being given corrections. When Claude realized that it was a computer system, it tried to write code to automate the game, but the attempt failed and it had to go back to playing manually.
The system's fragility became more apparent when the remote desktop crashed. Faced with the technical failure, Claude tried several fixes and eventually declared itself the "winner," reasoning that it had reached significant milestones and had done as much as it could under the existing conditions.
Mollick believes that this experiment reveals the current state and future direction of AI agents. Although current AI systems still have significant shortcomings, their demonstrated capabilities and adaptability are astonishing. He points out that collaborating with new-generation AIs requires a new mindset, as these AIs tend to work independently and are difficult to fully control.
To further explore the limits of Claude's abilities, Mollick also had it attempt other games, such as "Magic: The Gathering Arena." These tests help us understand the limitations of current AI systems and offer useful reference points for future AI applications in other fields.
This gaming experiment shows how AI systems perform in practice, with both surprising breakthroughs and obvious areas for improvement. As the technology advances, the capabilities of AI systems will continue to expand.