Recently, DeepSeek announced the launch of its first-generation reasoning model trained via reinforcement learning (RL), DeepSeek-R1, which achieves performance comparable to OpenAI-o1-1217 on several reasoning benchmarks. DeepSeek-R1 is built on the DeepSeek-V3-Base model and uses multi-stage training and cold-start data to enhance its reasoning capabilities.
DeepSeek researchers first developed DeepSeek-R1-Zero, a model trained entirely through large-scale reinforcement learning without any supervised fine-tuning (SFT) as a preliminary step. DeepSeek-R1-Zero demonstrated remarkable performance on reasoning benchmarks, raising its pass@1 score on AIME 2024 from 15.6% to 71.0%. However, DeepSeek-R1-Zero also suffered from issues such as poor readability and language mixing.
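For readers unfamiliar with the metric, pass@1 here measures the expected probability that a single sampled answer is correct. A minimal sketch of how it can be estimated is shown below; `generate` and `is_correct` are hypothetical placeholders supplied by the evaluator, not part of any DeepSeek tooling.

```python
# Estimate pass@1 by sampling k answers per problem and averaging the
# per-problem success rate. `generate` and `is_correct` are placeholders
# provided by the evaluation harness; they are not DeepSeek APIs.
def pass_at_1(problems, generate, is_correct, k=4):
    total = 0.0
    for problem in problems:
        answers = [generate(problem) for _ in range(k)]
        total += sum(is_correct(problem, a) for a in answers) / k
    return total / len(problems)
```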
To address these issues and further improve reasoning performance, the DeepSeek team developed DeepSeek-R1, which introduces multi-stage training and cold-start data before reinforcement learning. Specifically, the researchers first collected thousands of cold-start samples to fine-tune the DeepSeek-V3-Base model. They then ran reasoning-oriented reinforcement learning, following the same recipe used for DeepSeek-R1-Zero. As the reinforcement learning process approached convergence, they created new supervised fine-tuning data via rejection sampling from the RL checkpoint, combined it with DeepSeek-V3's supervised data covering writing, factual Q&A, and self-cognition, and retrained the DeepSeek-V3-Base model on this mixture. Finally, the retrained checkpoint underwent an additional round of reinforcement learning with prompts from all scenarios.
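The stages above can be condensed into a short sketch. The stage functions below (sft, reasoning_rl, rejection_sample) are illustrative stubs standing in for full training runs; this is not DeepSeek's actual training code.

```python
# Illustrative outline of the DeepSeek-R1 multi-stage pipeline.
# The stage functions are stand-in stubs, not real training code.
def sft(model, data):                  # supervised fine-tuning (stub)
    return f"{model} -> sft[{len(data)} samples]"

def reasoning_rl(model, prompts):      # reasoning-oriented RL (stub)
    return f"{model} -> rl"

def rejection_sample(model, prompts):  # keep only well-formed, correct outputs (stub)
    return [f"{model}:{p}" for p in prompts]

def train_r1(base, cold_start_data, v3_sft_data, all_prompts):
    m = sft(base, cold_start_data)               # Stage 1: cold-start SFT
    m = reasoning_rl(m, all_prompts)             # Stage 2: RL as in R1-Zero
    new_data = rejection_sample(m, all_prompts)  # Stage 3a: rejection sampling
    m = sft(base, new_data + v3_sft_data)        # Stage 3b: retrain base on mixed data
    return reasoning_rl(m, all_prompts)          # Stage 4: RL over all scenarios
```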
DeepSeek-R1 achieved impressive results across multiple benchmarks:
• On AIME 2024, DeepSeek-R1 reached a pass@1 score of 79.8%, slightly surpassing OpenAI-o1-1217.
• On MATH-500, DeepSeek-R1 achieved a pass@1 score of 97.3%, on par with OpenAI-o1-1217.
• On coding-competition tasks, DeepSeek-R1 obtained an Elo rating of 2,029 on Codeforces, outperforming 96.3% of human competitors.
• On knowledge benchmarks, DeepSeek-R1 scored 90.8% on MMLU, 84.0% on MMLU-Pro, and 71.5% on GPQA Diamond, significantly surpassing DeepSeek-V3.
• On other tasks such as creative writing, general Q&A, editing, and summarization, DeepSeek-R1 also performed exceptionally well.
Additionally, DeepSeek has explored distilling DeepSeek-R1's reasoning capabilities into smaller models. The researchers found that distilling directly from DeepSeek-R1 is more effective than applying reinforcement learning to the smaller models themselves, which suggests that the reasoning patterns discovered by larger base models are crucial for improving reasoning ability. DeepSeek has open-sourced DeepSeek-R1-Zero, DeepSeek-R1, and six dense models (1.5B, 7B, 8B, 14B, 32B, and 70B) distilled from DeepSeek-R1 on top of Qwen and Llama. The launch of DeepSeek-R1 marks significant progress in using reinforcement learning to enhance the reasoning capabilities of large language models.
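In practice, this kind of distillation amounts to ordinary supervised fine-tuning on teacher-generated data. A minimal sketch of the data-collection step is below; `teacher_generate` is a hypothetical placeholder for any call that returns DeepSeek-R1's full reasoning trace, and this is not the released distillation recipe.

```python
import json

# Collect reasoning traces from the teacher (e.g. DeepSeek-R1) as plain SFT
# records. `teacher_generate` is a placeholder, not a DeepSeek API.
def build_distillation_set(prompts, teacher_generate, out_path="distill.jsonl"):
    with open(out_path, "w", encoding="utf-8") as f:
        for prompt in prompts:
            record = {"prompt": prompt, "response": teacher_generate(prompt)}
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
```

The resulting file can then be fed to any standard SFT trainer to fine-tune a Qwen- or Llama-based student, mirroring the finding that distillation here is simply SFT on teacher outputs.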
Cost Advantage
In terms of cost, DeepSeek-R1 offers highly competitive pricing. API access is priced at $0.14 per million input tokens on a cache hit and $0.55 per million input tokens on a cache miss, with output tokens costing $2.19 per million. Compared with similar products, this pricing is far more attractive and has been described by users as a "game changer." The official website and API are now live; visit https://chat.deepseek.com to try DeepThink.
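The API is OpenAI-compatible, so existing client code needs only a different base URL and model name. The snippet below follows DeepSeek's public API documentation at the time of writing; verify the endpoint and the "deepseek-reasoner" model name against the current docs before relying on them.

```python
from openai import OpenAI

# DeepSeek exposes an OpenAI-compatible endpoint; only the base URL,
# API key, and model name differ from a standard OpenAI setup.
client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",
    base_url="https://api.deepseek.com",
)

response = client.chat.completions.create(
    model="deepseek-reasoner",  # DeepSeek-R1
    messages=[{"role": "user", "content": "How many primes are there below 100?"}],
)

print(response.choices[0].message.content)
```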
Community Feedback and Future Outlook
The release of DeepSeek-R1 has sparked lively discussion in the community. Many users appreciate the model's open-source nature and cost advantages, believing it gives developers more choice and freedom. However, some users have raised questions about the model's context window size and hope for further optimization in future versions.
The DeepSeek team has stated that it will continue to improve the model's performance and user experience, and plans to introduce more features in the future, such as advanced data analysis, to meet users' expectations on the path toward AGI (Artificial General Intelligence).