The Chinese AI startup DeepSeek has quietly released its latest large language model, DeepSeek-V3-0324, sending ripples through the AI industry. The model, weighing in at a substantial 641GB, appeared on the AI resource repository Hugging Face. The release continues DeepSeek's understated yet impactful style: no fanfare, just an empty README file and the model weights.
Licensed under MIT, the model is free for commercial use and can run directly on consumer-grade hardware such as an Apple Mac Studio equipped with an M3 Ultra chip. AI researcher Awni Hannun reported on social media that the 4-bit quantized version of DeepSeek-V3-0324 runs at over 20 tokens per second on a 512GB M3 Ultra. While the Mac Studio is expensive, the ability to run such a large model locally breaks top-tier AI's previous dependence on data centers.
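The storage math behind local inference is straightforward. The toy sketch below illustrates simple symmetric per-tensor 4-bit quantization and the resulting size estimate; it is not DeepSeek's actual quantization scheme, and the overhead note is an assumption:

```python
import numpy as np

def quantize_4bit(weights: np.ndarray):
    """Symmetric 4-bit quantization: map floats to integers in [-8, 7]."""
    scale = np.max(np.abs(weights)) / 7.0
    q = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the 4-bit integers."""
    return q.astype(np.float32) * scale

# Toy weight tensor standing in for one layer of the model.
rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)
q, scale = quantize_4bit(w)
w_hat = dequantize(q, scale)

# Back-of-the-envelope size for a 685B-parameter model at 4 bits/weight:
storage_gb = 685e9 * 0.5 / 1e9  # 0.5 bytes per parameter, about 342.5 GB
# The reported 352GB figure is slightly larger, presumably because scales
# and some layers kept at higher precision add overhead.
```

At 16 bits per weight the same model would need roughly 1.37TB, which is why 4-bit quantization is what makes a 512GB Mac Studio viable at all.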
DeepSeek-V3-0324 uses a Mixture-of-Experts (MoE) architecture, activating only about 37 billion of its 685 billion parameters per task rather than the full set, which significantly improves efficiency. It also incorporates Multi-Head Latent Attention (MLA) and Multi-Token Prediction (MTP). MLA strengthens the model's contextual understanding over long texts, while MTP lets the model generate multiple tokens at a time, increasing output speed by nearly 80%. The 4-bit quantized version reduces storage requirements to 352GB, making it feasible to run on high-end consumer hardware.
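The efficiency win of MoE comes from routing each token through only a few experts. Here is a minimal sketch of top-k expert routing, with made-up dimensions and plain matrix multiplies standing in for real expert networks; it is an illustration of the general technique, not DeepSeek's implementation:

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Route one token through the top-k experts by gate score.

    x: (d,) token embedding; gate_w: (d, n_experts) router weights;
    experts: list of (d, d) weight matrices, one per expert.
    """
    logits = x @ gate_w                       # router score per expert
    top_k = np.argsort(logits)[-k:]           # indices of the k best experts
    probs = np.exp(logits[top_k] - logits[top_k].max())
    probs /= probs.sum()                      # softmax over selected experts only
    # Only k experts execute; the rest stay idle -- that is the efficiency win.
    return sum(p * (x @ experts[i]) for p, i in zip(probs, top_k))

rng = np.random.default_rng(0)
d, n_experts = 8, 16
x = rng.normal(size=d)
gate_w = rng.normal(size=(d, n_experts))
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]
y = moe_forward(x, gate_w, experts, k=2)
```

With k=2 of 16 experts, only an eighth of the expert parameters are touched per token, which is the same principle as activating roughly 37 billion of 685 billion parameters.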
Early testers reported significant improvements over the previous version. AI researcher Xeophon claims the model shows massive leaps across all tested metrics, surpassing Anthropic's Claude 3.5 Sonnet to become the best non-reasoning model available. Unlike the subscription-based Sonnet, DeepSeek-V3-0324's weights are freely available for download.
DeepSeek's open-source release strategy contrasts sharply with Western AI companies. American companies like OpenAI and Anthropic impose paywalls on their models, while Chinese AI companies increasingly favor permissive open-source licenses. This strategy accelerates the development of China's AI ecosystem, with tech giants like Baidu, Alibaba, and Tencent following suit by releasing open-source AI models. Faced with Nvidia chip restrictions, Chinese companies are turning disadvantages into competitive advantages by emphasizing efficiency and optimization.
DeepSeek-V3-0324 is likely the foundation for the upcoming DeepSeek-R2 reasoning model. Reasoning models carry enormous computational demands, so an efficient base model matters. If DeepSeek-R2 performs well, it will pose a direct challenge to OpenAI's rumored GPT-5.
Users and developers wishing to try DeepSeek-V3-0324 can download the complete model weights from Hugging Face, though the sheer file size demands substantial storage and computing resources. Cloud services such as OpenRouter offer free API access and a user-friendly chat interface; DeepSeek's own chat interface may also be updated to support the new version. Developers can likewise integrate the model through inference providers like Hyperbolic Labs.
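For API access, OpenRouter exposes an OpenAI-compatible chat-completions endpoint. The sketch below builds such a request using only the standard library; the model identifier is an assumption and should be checked against OpenRouter's model list:

```python
import json
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"
# Assumed model ID -- verify against OpenRouter's published model list.
MODEL_ID = "deepseek/deepseek-chat-v3-0324"

def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build an OpenAI-compatible chat-completions request for OpenRouter."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

req = build_request("Summarize Mixture-of-Experts in one sentence.",
                    api_key="YOUR_KEY")
# To actually send it: urllib.request.urlopen(req), then parse the JSON body.
```

Because the endpoint follows the OpenAI chat-completions shape, existing OpenAI client code typically needs only the base URL and model name changed.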
It's worth noting that DeepSeek-V3-0324's communication style has shifted from the previous version's human-like conversational tone to a more formal, technical one. The change suits professional and technical applications but may dull its appeal in consumer-facing ones.
DeepSeek's open-source strategy is reshaping the global AI landscape. Previously, China lagged the US in AI by 1-2 years; that gap has now narrowed to 3-6 months, with China even leading in some areas. Much as Android achieved global dominance through open source, open-source AI models, leveraging widespread adoption and collective innovation from developers, are poised to excel in competition and drive broader AI adoption.