The Chinese AI startup DeepSeek has quietly released its latest large language model, DeepSeek-V3-0324, sending ripples through the AI industry. The model, weighing in at a substantial 641GB, appeared on the AI resource repository Hugging Face. The release continues DeepSeek's understated yet impactful style: no fanfare, just an empty README file and the model weights.
Licensed under MIT, the model is free for commercial use and can run directly on consumer-grade hardware such as an Apple Mac Studio equipped with an M3 Ultra chip. AI researcher Awni Hannun reported on social media that the 4-bit quantized version of DeepSeek-V3-0324 runs at over 20 tokens per second on a 512GB M3 Ultra. While the Mac Studio is expensive, the ability to run such a large model locally breaks top-tier AI's previous dependence on data centers.
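The storage math behind local inference is straightforward. The toy sketch below illustrates simple symmetric per-tensor 4-bit quantization and the resulting size estimate; it is not DeepSeek's actual quantization scheme, and the overhead note is an assumption:

```python
import numpy as np

def quantize_4bit(weights: np.ndarray):
    """Symmetric 4-bit quantization: map floats to integers in [-8, 7]."""
    scale = np.max(np.abs(weights)) / 7.0
    q = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the 4-bit integers."""
    return q.astype(np.float32) * scale

# Toy weight tensor standing in for one layer of the model.
rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)
q, scale = quantize_4bit(w)
w_hat = dequantize(q, scale)

# Back-of-the-envelope size for a 685B-parameter model at 4 bits/weight:
storage_gb = 685e9 * 0.5 / 1e9  # 0.5 bytes per parameter, about 342.5 GB
# The reported 352GB figure is slightly larger, presumably because scales
# and some layers kept at higher precision add overhead.
```

At 16 bits per weight the same model would need roughly 1.37TB, which is why 4-bit quantization is what makes a 512GB Mac Studio viable at all.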
DeepSeek-V3-0324 uses a Mixture-of-Experts (MoE) architecture, activating only about 37 billion of its 685 billion parameters per task rather than the full set, which significantly improves efficiency. It also incorporates Multi-Head Latent Attention (MLA) and Multi-Token Prediction (MTP). MLA strengthens the model's contextual understanding over long texts, while MTP lets the model generate multiple tokens at a time, increasing output speed by nearly 80%. The 4-bit quantized version reduces storage requirements to 352GB, making it feasible to run on high-end consumer hardware.
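The efficiency win of MoE comes from routing each token through only a few experts. Here is a minimal sketch of top-k expert routing, with made-up dimensions and plain matrix multiplies standing in for real expert networks; it is an illustration of the general technique, not DeepSeek's implementation:

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Route one token through the top-k experts by gate score.

    x: (d,) token embedding; gate_w: (d, n_experts) router weights;
    experts: list of (d, d) weight matrices, one per expert.
    """
    logits = x @ gate_w                       # router score per expert
    top_k = np.argsort(logits)[-k:]           # indices of the k best experts
    probs = np.exp(logits[top_k] - logits[top_k].max())
    probs /= probs.sum()                      # softmax over selected experts only
    # Only k experts execute; the rest stay idle -- that is the efficiency win.
    return sum(p * (x @ experts[i]) for p, i in zip(probs, top_k))

rng = np.random.default_rng(0)
d, n_experts = 8, 16
x = rng.normal(size=d)
gate_w = rng.normal(size=(d, n_experts))
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]
y = moe_forward(x, gate_w, experts, k=2)
```

With k=2 of 16 experts, only an eighth of the expert parameters are touched per token, which is the same principle as activating roughly 37 billion of 685 billion parameters.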
Early testers reported significant improvements over the previous version. AI researcher Xeophon claims the model shows massive leaps across all tested metrics, surpassing Anthropic's Claude 3.5 Sonnet to become the best non-reasoning model available. Unlike the subscription-based Sonnet, DeepSeek-V3-0324's weights are freely available for download.
DeepSeek's open-source release strategy contrasts sharply with Western AI companies. American companies like OpenAI and Anthropic impose paywalls on their models, while Chinese AI companies increasingly favor permissive open-source licenses. This strategy accelerates the development of China's AI ecosystem, with tech giants like Baidu, Alibaba, and Tencent following suit by releasing open-source AI models. Faced with Nvidia chip restrictions, Chinese companies are turning disadvantages into competitive advantages by emphasizing efficiency and optimization.
DeepSeek-V3-0324 is likely the foundation for the upcoming DeepSeek-R2 reasoning model. Reasoning models carry enormous computational demands, so an efficient base model matters. If DeepSeek-R2 performs well, it will pose a direct challenge to OpenAI's rumored GPT-5.
Users and developers wishing to try DeepSeek-V3-0324 can download the complete model weights from Hugging Face, though the sheer file size demands substantial storage and computing resources. Cloud services such as OpenRouter offer free API access and a user-friendly chat interface; DeepSeek's own chat interface may also be updated to support the new version. Developers can likewise integrate the model through inference providers like Hyperbolic Labs.
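For API access, OpenRouter exposes an OpenAI-compatible chat-completions endpoint. The sketch below builds such a request using only the standard library; the model identifier is an assumption and should be checked against OpenRouter's model list:

```python
import json
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"
# Assumed model ID -- verify against OpenRouter's published model list.
MODEL_ID = "deepseek/deepseek-chat-v3-0324"

def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build an OpenAI-compatible chat-completions request for OpenRouter."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

req = build_request("Summarize Mixture-of-Experts in one sentence.",
                    api_key="YOUR_KEY")
# To actually send it: urllib.request.urlopen(req), then parse the JSON body.
```

Because the endpoint follows the OpenAI chat-completions shape, existing OpenAI client code typically needs only the base URL and model name changed.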
It's worth noting that DeepSeek-V3-0324's communication style has shifted from the previous version's human-like conversational tone to a more formal, technical one. The change suits professional and technical applications but may dull its appeal in consumer-facing ones.
DeepSeek's open-source strategy is reshaping the global AI landscape. Previously, China lagged the US in AI by 1-2 years; that gap has now narrowed to 3-6 months, with China even leading in some areas. Much as Android achieved global dominance through open source, open-source AI models, leveraging widespread adoption and collective innovation from developers, are poised to excel in competition and drive broader AI adoption.