Skywork-MoE-Base

A high-performance Mixture-of-Experts (MoE) model with 146 billion parameters

Skywork-MoE-Base is a high-performance Mixture-of-Experts (MoE) model with 146 billion total parameters, comprising 16 experts and activating 22 billion parameters per token. The model is initialized from the dense checkpoints of Skywork-13B and introduces two training techniques: gating logit normalization, which enhances expert diversity, and adaptive auxiliary loss coefficients, which allow the auxiliary loss coefficient to be adjusted per layer. On a range of popular benchmarks, Skywork-MoE performs comparably to or better than models with more total or activated parameters.
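To make the two techniques concrete, here is a minimal PyTorch sketch. The function names, the sharpness hyperparameter lam, and the update constants beta and xi are illustrative assumptions for exposition, not Skywork's published implementation.

```python
import torch
import torch.nn.functional as F

def normalized_gate(logits: torch.Tensor, lam: float = 1.0) -> torch.Tensor:
    """Gating logit normalization (sketch): standardize the router logits
    across the expert dimension before the softmax. The scale `lam` is a
    hyperparameter controlling how sharp the gate distribution is;
    sharper gates encourage more distinct expert assignments."""
    mu = logits.mean(dim=-1, keepdim=True)
    sigma = logits.std(dim=-1, keepdim=True)
    return F.softmax(lam * (logits - mu) / (sigma + 1e-6), dim=-1)

def update_aux_coeff(coeff: float, drop_rate: float,
                     beta: float = 0.99, xi: float = 10.0) -> float:
    """Adaptive auxiliary loss coefficient (hypothetical update rule):
    move a layer's load-balancing coefficient toward a value proportional
    to that layer's observed token drop rate, so layers that drop many
    tokens receive a stronger balancing penalty."""
    return beta * coeff + (1.0 - beta) * xi * drop_rate

# Example: route a batch of 4 tokens across 16 experts with top-2 routing.
gate_probs = normalized_gate(torch.randn(4, 16), lam=1.0)
topk_vals, topk_idx = gate_probs.topk(2, dim=-1)
```

The intuition behind the standardization step is that it decouples the effective softmax temperature from the raw magnitude of the router logits, leaving sharpness under the control of a single tunable scale.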

Skywork-MoE-Base Visit Over Time

Monthly Visits: 17,788,201
Bounce Rate: 44.87%
Pages per Visit: 5.4
Visit Duration: 00:05:32
