Qwen1.5-MoE-A2.7B

A large-scale MoE (Mixture of Experts) language model whose performance rivals that of 7-billion-parameter models.

Tags: Editor Recommendation, Programming, Natural Language Processing, Large Model
Qwen1.5-MoE-A2.7B is a large-scale MoE (Mixture of Experts) language model with only 2.7 billion activated parameters. Despite this small activated size, it achieves performance comparable to 7-billion-parameter models. Compared to training a conventional dense model of similar capability, it reduces training cost by 75% and speeds up inference by 1.74x. Its MoE architecture combines fine-grained experts, a new initialization method, and a routing mechanism, which together significantly improve model efficiency. The model is suitable for a range of tasks, including natural language processing and code generation.
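The routing mechanism mentioned above can be sketched as a top-k softmax gate: for each token, the router scores all experts, keeps only the k highest-scoring ones, and renormalizes their weights. The function name and toy sizes below are illustrative assumptions, not Qwen's actual implementation.

```python
import numpy as np

def topk_softmax_routing(logits, k=4):
    """Select the top-k experts per token and renormalize their gate weights.

    logits: (num_tokens, num_experts) router scores.
    Returns (indices, weights), each of shape (num_tokens, k).
    """
    # Indices of the k largest logits per token (order within the k is arbitrary).
    idx = np.argpartition(logits, -k, axis=-1)[:, -k:]
    top = np.take_along_axis(logits, idx, axis=-1)
    # Softmax over only the selected experts, so their weights sum to 1.
    e = np.exp(top - top.max(axis=-1, keepdims=True))
    w = e / e.sum(axis=-1, keepdims=True)
    return idx, w

# Toy example: 2 tokens routed over 8 hypothetical experts. Only the
# selected experts' FFNs would run, which is why activated parameters
# stay far below total parameters.
rng = np.random.default_rng(0)
logits = rng.normal(size=(2, 8))
idx, w = topk_softmax_routing(logits, k=4)
print(idx.shape, w.shape)                # (2, 4) (2, 4)
print(np.allclose(w.sum(axis=-1), 1.0))  # True: gate weights are normalized
```

Each token's output is then the gate-weighted sum of its selected experts' outputs; the fine-grained design splits capacity across many small experts so routing decisions are more flexible.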

Qwen1.5-MoE-A2.7B Visit Over Time

Monthly Visits: 331,351
Bounce Rate: 58.83%
Pages per Visit: 2.0
Visit Duration: 00:01:36
