Huanfang Quantum Announces the Launch of DeepSeek-V3: Performance Comparable to GPT-4 with Unprecedented Low Training Costs

AIbase基地

Published in AI News · 4 min read · Dec 27, 2024

On the evening of December 26, Huanfang released DeepSeek-V3, its new-generation large language model. Built on a Mixture of Experts (MoE) architecture, the model rivals top closed-source models in performance and has drawn industry attention for its low cost and high efficiency.

DeepSeek-V3 has 671 billion total parameters, of which 37 billion are active, and it was pre-trained on 14.8 trillion tokens. Generation speed has tripled compared with its predecessor, reaching 60 tokens per second, which makes the model considerably more efficient in practical use.
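
To put these figures in proportion, the short Python sketch below works out what share of the parameters is active and what predecessor speed the "threefold" claim implies. The variable names are illustrative, and the predecessor speed is only inferred from the quoted ratio; it is not a number stated in the announcement.

```python
# Figures quoted in the article.
TOTAL_PARAMS_BILLION = 671
ACTIVE_PARAMS_BILLION = 37
NEW_SPEED_TOKENS_PER_SEC = 60
SPEEDUP_FACTOR = 3  # "threefold" increase over the predecessor

# Share of parameters that are active in the MoE model.
active_fraction = ACTIVE_PARAMS_BILLION / TOTAL_PARAMS_BILLION

# Predecessor speed implied by the quoted ratio (an inference, not a quoted figure).
implied_previous_speed = NEW_SPEED_TOKENS_PER_SEC / SPEEDUP_FACTOR

print(f"Active share of parameters: {active_fraction:.1%}")            # 5.5%
print(f"Implied predecessor speed: {implied_previous_speed:.0f} tokens/s")  # 20 tokens/s
```

In other words, only about one parameter in eighteen is active, which is the main reason an MoE model of this size can remain fast at inference time.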

In performance evaluations, DeepSeek-V3 surpasses well-known open-source models such as Qwen2.5-72B and Llama-3.1-405B and performs comparably to GPT-4 and Claude-3.5-Sonnet on several tests. In mathematical ability assessments in particular, the model achieved outstanding results, exceeding all existing open-source and closed-source models.

What stands out most is DeepSeek-V3's low cost. According to the open-sourced paper, the model's total training cost was only $5.576 million, calculated at $2 per GPU hour. The achievement is attributed to the co-optimization of algorithms, frameworks, and hardware. OpenAI co-founder Andrej Karpathy praised the result, noting that DeepSeek-V3 surpassed Llama 3 in performance with only about 2.8 million GPU hours, an improvement in computational efficiency of roughly 11 times.
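
The cost and GPU-hour figures above are consistent with each other, as the following Python sketch shows. It takes the quoted total cost and hourly rate and derives the GPU hours they imply. The baseline at the end is only what the quoted "11 times" ratio would imply; it is an inference for illustration, not a figure from the article.

```python
# Figures quoted in the article.
TOTAL_COST_USD = 5_576_000       # reported total training cost
RATE_USD_PER_GPU_HOUR = 2.0      # hourly rate used for the estimate
EFFICIENCY_RATIO = 11            # "approximately 11 times"

def implied_gpu_hours(total_cost_usd: float, rate_usd_per_gpu_hour: float) -> float:
    """GPU hours implied by a total cost and a flat hourly rate."""
    return total_cost_usd / rate_usd_per_gpu_hour

gpu_hours = implied_gpu_hours(TOTAL_COST_USD, RATE_USD_PER_GPU_HOUR)
print(f"Implied GPU hours: {gpu_hours:,.0f}")  # 2,788,000

# Baseline GPU hours implied by the quoted ratio (inferred, not quoted).
implied_baseline_gpu_hours = gpu_hours * EFFICIENCY_RATIO
print(f"Implied baseline: {implied_baseline_gpu_hours:,.0f} GPU hours")  # 30,668,000
```

The result, about 2.79 million GPU hours, matches the "2.8 million" figure attributed to Karpathy once rounded.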

On the commercial side, API pricing for DeepSeek-V3 is higher than for the previous generation but still offers strong value for money. The new version is priced at 0.5-2 RMB per million input tokens and 8 RMB per million output tokens, or about 10 RMB in total for one million input tokens plus one million output tokens. By contrast, the equivalent GPT-4 service costs around 140 RMB, a significant price difference.
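
The "about 10 RMB" total can be reproduced with a small cost calculation. The sketch below assumes that the output price, like the input price, is charged per million tokens, and that the totals refer to a workload of one million input tokens plus one million output tokens. The function name `api_cost_rmb` is hypothetical and not part of any DeepSeek SDK.

```python
def api_cost_rmb(input_tokens: int, output_tokens: int,
                 input_price_per_million: float,
                 output_price_per_million: float) -> float:
    """Cost in RMB for a workload, given prices per million tokens."""
    return (input_tokens / 1_000_000 * input_price_per_million
            + output_tokens / 1_000_000 * output_price_per_million)

# Assumed workload: one million tokens in, one million tokens out.
WORKLOAD = (1_000_000, 1_000_000)

# Quoted input price range is 0.5-2 RMB; quoted output price is 8 RMB.
low_total = api_cost_rmb(*WORKLOAD, 0.5, 8.0)
high_total = api_cost_rmb(*WORKLOAD, 2.0, 8.0)
print(f"DeepSeek-V3 total: {low_total} to {high_total} RMB")  # 8.5 to 10.0 RMB

# GPT-4 figure quoted in the article, assumed to cover the same workload.
GPT4_QUOTED_TOTAL_RMB = 140.0
price_ratio = GPT4_QUOTED_TOTAL_RMB / high_total
print(f"GPT-4 costs about {price_ratio:.0f}x as much")  # 14x
```

At the upper end of the input price range the total is 10 RMB, which matches the figure in the article and puts the quoted GPT-4 price at roughly 14 times higher.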

As a fully open-source large model, DeepSeek-V3 both showcases the progress of Chinese AI technology and gives developers and enterprises a high-performance, low-cost AI solution.

DeepSeek-V3 · MoE · 671 billion parameters · Mixture of Experts

This article is from AIbase Daily
