On April 15, OpenAI announced the GPT-4.1 series on its official blog. The series comprises three models: GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano, and delivers significant gains in coding, instruction following, and long-context processing over its predecessors, GPT-4o and GPT-4o mini. Notably, the context window has been expanded to 1 million tokens and the knowledge cutoff updated to June 2024, providing stronger support for complex tasks.

For now, the GPT-4.1 series is available only to developers through the API; regular users cannot access it from the ChatGPT interface. According to OpenAI, GPT-4.1 generates code 40% faster than GPT-4o while cutting user query costs by 80%, a significant improvement in both development efficiency and cost-effectiveness.
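
For developers who do want to try it today, a minimal sketch of such an API call is shown below. It assumes the official OpenAI Python SDK (`pip install openai`) and an `OPENAI_API_KEY` environment variable; the prompt is purely illustrative.

```python
# Minimal sketch: calling GPT-4.1 through the OpenAI API (Python SDK).
# Assumes `pip install openai` and an OPENAI_API_KEY environment variable;
# the prompt below is only an example.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[
        {"role": "user", "content": "Write a Python function that reverses a string."}
    ],
)

print(response.choices[0].message.content)
```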

OpenAI releases the new GPT-4.1 series models!  Smarter and cheaper than GPT-4o

Performance: Record-Breaking Benchmark Tests

  • Coding Ability: On SWE-bench Verified, GPT-4.1 scored 54.6%, a 21.4 percentage point improvement over GPT-4o;
  • Instruction Following: a 10.5 percentage point improvement over GPT-4o on the MultiChallenge test;
  • Multimodal Processing: a new high of 72.0% on the Video-MME test.

GPT-4.1 mini approaches or even surpasses GPT-4o on several tests while cutting latency by nearly 50% and cost by 83%. GPT-4.1 nano, the lightweight member of the family, offers the same 1 million token context window and an MMLU score of 80.1%, making it a cost-effective choice for classification and auto-completion tasks. Thanks to inference-stack optimization and prompt caching, the series' initial response time has been significantly shortened, giving developers an efficient, low-cost solution.

Significant Real-World Application Results

  • Coding Efficiency: In Windsurf's tests, GPT-4.1 improved coding efficiency by 30% and reduced ineffective edits by 50%;
  • Legal Field: After integrating GPT-4.1, Thomson Reuters' legal AI assistant, CoCounsel, saw a 17% increase in multi-document review accuracy.

GPT-4.1 is priced at $2 per million input tokens and $8 per million output tokens. For median queries, it delivers better performance than GPT-4o at a 26% lower cost. GPT-4.1 nano, with its ultra-low latency and cost, is currently OpenAI's most economical model option.
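
As a rough back-of-the-envelope check on those prices, the short sketch below estimates the cost of a single request from its token counts; the token numbers used in the example are made up purely for illustration.

```python
# Back-of-the-envelope cost estimate for one GPT-4.1 request, using the
# published rates of $2 per million input tokens and $8 per million
# output tokens. The token counts below are illustrative only.
INPUT_RATE = 2.00 / 1_000_000   # USD per input token
OUTPUT_RATE = 8.00 / 1_000_000  # USD per output token

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Example: a 20,000-token prompt with a 1,500-token completion
print(f"${estimate_cost(20_000, 1_500):.4f}")  # -> $0.0520
```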