The recently launched o3AI model by OpenAI is considered its most powerful artificial intelligence product, but the operating costs are astonishing, with a single task costing over $1000.
According to a report by TechCrunch, this new model employs a technique called "computation during testing" when dealing with complex problems, meaning it spends more time thinking and exploring various possibilities before arriving at an answer. As a result, OpenAI engineers hope that o3 can generate higher-quality responses under complex prompts.
According to François Chollet, the founder of the ARC-AGI benchmark test, o3 scored 87.5% under its powerful "high computation mode," nearly three times the 32% score of the previous generation o1 model. This indicates a significant performance improvement for o3. However, this refined computational process comes with a huge cost. To achieve this high score, the computational cost of o3 exceeds $1000 per task, utilizing 170 times more computing power than the low-power version of o3, which was less than $4 per task.
This situation has raised concerns in the industry about the contradiction between the performance of the o3 model and its operating costs. On one hand, the substantial increase in o3's score seems to demonstrate that AI models can still progress through "scaling," which involves increasing processing power and training data. On the other hand, criticisms regarding diminishing returns from scaling are also on the rise. While o3's improvements are primarily attributed to enhancements in its "reasoning" capabilities rather than mere scaling, its high operating costs undoubtedly raise concerns.
Even the low-computation version of o3 scored 76% in benchmark tests, but the cost per task reached around $20. Although this is still relatively cheap compared to other options, it is still several times more expensive than its predecessor. Moreover, considering that ChatGPT Plus charges only $25 per month, OpenAI faces significant cost pressure in enhancing the intelligence level of user interactions.
In a blog discussing the benchmark results, Chollet pointed out that while o3 performs at a level close to human capability, "the costs are still high and not yet economical." He noted that the labor cost to solve ARC-AGI tasks is about $5 per task, while energy costs are just a few cents. However, he optimistically believes that "cost-effectiveness may significantly improve in the coming months and years." Currently, o3 has not been released to the public, and its "mini version" is expected to launch in January next year.
Key Points:
🌟 The cost of a single query with the o3AI model exceeds $1000, highlighting its high operating expenses.
📊 In the ARC-AGI benchmark test, o3 scored 87.5%, nearly three times that of the previous generation o1 model.
🔍 Currently, o3 has not been released to the public, and its "mini version" is expected to launch in January next year.