The highly anticipated DeepSeek V3 has been open-sourced. The new model makes significant gains in multilingual programming, outperforming competitors such as Claude 3.5 Sonnet V2 on the Aider multilingual programming evaluation, and it has attracted widespread attention in the industry.
According to reports, DeepSeek V3 is a substantial improvement over its predecessor. DeepSeek V2.5 had a success rate of only 17% on the Aider evaluation, while V3 reaches 48%.
DeepSeek V3 uses a mixture-of-experts (MoE) architecture with as many as 685 billion parameters. The architecture has 256 experts and uses sigmoid routing, selecting the top 8 experts (topk=8) for each token. Because only the selected experts run for a given token, just a fraction of the parameters is active at any step, which lets the model handle complex tasks more efficiently.
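The routing step described above can be sketched in a few lines of Python. This is a minimal illustration, not DeepSeek's actual implementation: the expert count (256) and top-k value (8) come from the article, while the function names, the random logits, and the choice to renormalize the selected scores so they sum to 1 are assumptions made for the example.

```python
import math
import random

NUM_EXPERTS = 256  # from the article
TOP_K = 8          # from the article (topk=8)


def sigmoid(x):
    """Logistic function; maps a router logit to a score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def route_token(logits, top_k=TOP_K):
    """Pick the top_k experts for one token (hypothetical helper).

    Each expert is scored independently with a sigmoid, unlike softmax
    routing where scores compete across experts. Renormalizing the
    selected scores is an assumption of this sketch.
    """
    scores = [sigmoid(z) for z in logits]
    # Indices of the top_k highest-scoring experts.
    chosen = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]
    total = sum(scores[i] for i in chosen)
    weights = [scores[i] / total for i in chosen]
    return chosen, weights


# Stand-in router logits for a single token; a real model would compute
# these from the token's hidden state.
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

experts, weights = route_token(logits)
print("selected experts:", experts)
print("gate weights sum:", sum(weights))
```

Only the 8 selected experts would then process the token, with their outputs combined using the gate weights; the other 248 experts are skipped for that token.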
The open-sourcing of DeepSeek V3 is expected to energize the AI community. Its strong programming capabilities could play a significant role in software development, automation, and other fields, and could support the adoption of AI across industries.
Address: https://huggingface.co/deepseek-ai/DeepSeek-V3-Base/tree/main