OpenAI has released another notable model: GPT-4o mini, billed as its most cost-efficient and capable small model. This is more than an incremental upgrade; it is a step toward making intelligence broadly affordable. Today, let's take a closer look at GPT-4o mini and see how it brings intelligence "down to earth."


Smarter and More Cost-Effective

OpenAI's vision is to make intelligence ubiquitous, and the GPT-4o mini is the latest embodiment of this vision. This model not only significantly reduces costs but also performs exceptionally well. It costs only 15 cents per million input tokens and 60 cents per million output tokens, which is an order of magnitude cheaper than previous state-of-the-art models and over 60% cheaper than GPT-3.5 Turbo.

The low cost and low latency of GPT-4o mini make it a good fit for a wide range of tasks, such as applications that chain or parallelize multiple model calls (e.g., calling multiple APIs), pass large amounts of context to the model (e.g., an entire code repository or conversation history), or interact with customers through fast, real-time text responses (e.g., customer support chatbots).
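As a rough illustration of that last use case, a minimal customer-support-style call to the Chat Completions API might look like the sketch below. It assumes the official `openai` Python SDK (v1+) with an API key set in the environment; the system prompt and user question are placeholders.

```python
# Minimal sketch: a single low-latency chat completion with gpt-4o-mini.
# Assumes the official `openai` Python SDK (v1+) and OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a concise customer-support assistant."},
        {"role": "user", "content": "My order #1234 hasn't arrived yet. What should I do?"},
    ],
)

print(response.choices[0].message.content)
```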

Currently, GPT-4o mini supports text and vision in the API, with support for text, image, video, and audio inputs and outputs planned for the future. The model has a 128K-token context window, supports up to 16K output tokens per request, and has a knowledge cutoff of October 2023. Thanks to the improved tokenizer shared with GPT-4o, processing non-English text is now more cost-effective.
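As a sketch only, passing an image to the model through the Chat Completions API and capping the response length could look like the following; the image URL is a placeholder, and the same SDK assumptions as above apply:

```python
# Minimal sketch: sending an image alongside text to gpt-4o-mini.
# The image URL is a placeholder; assumes the official `openai` Python SDK (v1+).
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    max_tokens=1000,  # well under the model's 16K output-token ceiling per request
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe what is shown in this image."},
                {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }
    ],
)

print(response.choices[0].message.content)
```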


Small Size, Big Wisdom

GPT-4o mini outperforms GPT-3.5 Turbo and other small models on academic benchmarks, in both textual intelligence and multimodal reasoning. It supports the same range of languages as GPT-4o, excels at function calling (which lets developers build applications that retrieve data from external systems or perform actions), and shows improved long-context performance compared with GPT-3.5 Turbo.
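To make the function-calling point concrete, here is a minimal sketch in which the model is offered a hypothetical `get_order_status` function and returns structured arguments for it; the function name and schema are illustrative only, not part of OpenAI's announcement:

```python
# Minimal sketch of function calling with gpt-4o-mini: the model decides whether to
# call a (hypothetical) order-lookup function and returns structured JSON arguments.
# Assumes the official `openai` Python SDK (v1+); get_order_status is illustrative only.
import json
from openai import OpenAI

client = OpenAI()

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_order_status",
            "description": "Look up the shipping status of a customer order.",
            "parameters": {
                "type": "object",
                "properties": {
                    "order_id": {"type": "string", "description": "The order identifier."}
                },
                "required": ["order_id"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Where is order #1234?"}],
    tools=tools,
)

message = response.choices[0].message
if message.tool_calls:  # the model may also answer in plain text instead
    call = message.tool_calls[0]
    print(call.function.name)                   # e.g. "get_order_status"
    print(json.loads(call.function.arguments))  # e.g. {"order_id": "1234"}
```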

GPT-4o mini's performance on key benchmarks is as follows:

Reasoning Tasks: On reasoning tasks involving both text and vision, GPT-4o mini scores 82.0% on MMLU, compared with 77.9% for Gemini Flash and 73.8% for Claude Haiku.

Mathematical and Coding Abilities: GPT-4o mini also performs well in mathematical reasoning and coding tasks. On the MGSM (mathematical reasoning) test, it scores 87.0%, while Gemini Flash scores 75.5% and Claude Haiku scores 71.7%. On the HumanEval (coding performance) test, it scores 87.2%, while Gemini Flash scores 71.5% and Claude Haiku scores 75.9%.

Multimodal Reasoning: On the MMMU (multimodal reasoning evaluation), GPT-4o mini scores 59.4%, while Gemini Flash scores 56.1% and Claude Haiku scores 50.2%.

Built-in Safety Measures

Safety has been central to the model's development from the start. During pre-training, OpenAI filters out material it does not want the model to learn or reproduce, such as hate speech, adult content, websites that primarily aggregate personal information, and spam. After training, techniques such as reinforcement learning from human feedback (RLHF) are used to align the model's behavior with OpenAI's policies, improving the accuracy and reliability of its responses.

GPT-4o mini incorporates the same safety mitigations as GPT-4o, which OpenAI evaluated through both automated and human assessments in line with its Preparedness Framework and voluntary commitments. More than 70 external experts in areas such as social psychology and misinformation tested GPT-4o to identify potential risks, which have been addressed, with detailed findings to be shared in the forthcoming GPT-4o system card and preparedness scorecard. The insights from these expert evaluations have helped improve the safety of both GPT-4o and GPT-4o mini.

Availability and Pricing

GPT-4o mini is now available as a text and vision model in the Assistants API, Chat Completions API, and Batch API. Developers pay 15 cents per 1M input tokens and 60 cents per 1M output tokens (roughly the equivalent of 2,500 pages in a standard book). OpenAI plans to roll out fine-tuning for GPT-4o mini in the coming days.
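To put the pricing in concrete terms, the back-of-the-envelope estimate below applies the published per-token rates to purely hypothetical monthly usage figures:

```python
# Back-of-the-envelope cost estimate at GPT-4o mini's published API pricing.
# The monthly token volumes below are hypothetical example figures.
INPUT_PRICE_PER_1M = 0.15   # USD per 1M input tokens
OUTPUT_PRICE_PER_1M = 0.60  # USD per 1M output tokens

input_tokens = 50_000_000   # e.g. 50M input tokens per month (assumed)
output_tokens = 10_000_000  # e.g. 10M output tokens per month (assumed)

cost = (input_tokens / 1_000_000) * INPUT_PRICE_PER_1M \
     + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_1M
print(f"Estimated monthly cost: ${cost:.2f}")  # 50 * 0.15 + 10 * 0.60 = $13.50
```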

In ChatGPT, Free, Plus, and Team users can access GPT-4o mini starting today, replacing GPT-3.5. Enterprise users will gain access starting next week, in line with OpenAI's mission of making the benefits of AI broadly accessible.

Looking Ahead

The OpenAI team stated, "Over the past few years, we have witnessed significant progress in AI intelligence, along with a dramatic reduction in costs. For example, since the introduction of the less capable text-davinci-003 model in 2022, the cost per token for GPT-4o mini has dropped by 99%. We are committed to continuing to reduce costs while enhancing model capabilities."

"The future we envision is one where models are seamlessly integrated into every application and every website. GPT-4o mini paves the way for developers to build and scale powerful AI applications more efficiently and economically. The future of AI is becoming more accessible, reliable, and embedded in our daily digital experiences, and we are excited to continue leading this trend."