MiniMax-01

A powerful language model with a total of 456 billion parameters, capable of processing context lengths of up to 4 million tokens.

MiniMax-01 is a large language model with 456 billion total parameters, of which 45.9 billion are activated per token. It employs a hybrid architecture that combines lightning attention, softmax attention, and a mixture of experts (MoE). Through advanced parallelism strategies and computation-communication overlap methods, including Linear Attention Sequence Parallelism Plus (LASP+), variable-length ring attention, and Expert Tensor Parallelism (ETP), it extends the training context length to 1 million tokens and can process contexts of up to 4 million tokens at inference. MiniMax-01 has demonstrated top-tier performance across multiple academic benchmarks. A minimal sketch of the hybrid layer layout appears below.
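To make the hybrid layout concrete, here is a minimal PyTorch sketch of such a stack: linear-attention layers (the O(n) family that lightning attention belongs to) interleaved with an occasional softmax-attention layer, each followed by a top-k mixture-of-experts feed-forward block. Everything here is an illustrative assumption rather than MiniMax-01's actual code: the 7:1 interleave ratio, the expert count and top-k value, the module names, and the tiny dimensions. Causal masking and the parallelism machinery (LASP+, ring attention, ETP) are omitted for brevity.

```python
# Illustrative sketch only -- not MiniMax-01's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LinearAttention(nn.Module):
    """O(n) attention: the softmax is replaced by a positive feature map,
    so keys/values are summarized into a fixed-size state (causal masking
    omitted for brevity)."""
    def __init__(self, dim):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.out = nn.Linear(dim, dim)

    def forward(self, x):                        # x: (batch, seq, dim)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q, k = F.elu(q) + 1, F.elu(k) + 1        # positive feature map
        kv = torch.einsum("bnd,bne->bde", k, v)  # fixed-size K^T V summary
        z = 1.0 / (torch.einsum("bnd,bd->bn", q, k.sum(dim=1)) + 1e-6)
        return self.out(torch.einsum("bnd,bde,bn->bne", q, kv, z))

class SoftmaxAttention(nn.Module):
    """Standard O(n^2) scaled dot-product attention."""
    def __init__(self, dim, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):
        out, _ = self.attn(x, x, x, need_weights=False)
        return out

class MoE(nn.Module):
    """Token-wise top-k routing over expert FFNs. Every expert runs on
    every token here for clarity; real MoE dispatches tokens sparsely."""
    def __init__(self, dim, n_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                          nn.Linear(4 * dim, dim))
            for _ in range(n_experts)])
        self.top_k = top_k

    def forward(self, x):
        weights, idx = self.router(x).softmax(-1).topk(self.top_k, dim=-1)
        weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalize
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = (idx[..., slot] == e).unsqueeze(-1)
                out = out + mask * weights[..., slot:slot + 1] * expert(x)
        return out

class HybridBlock(nn.Module):
    """Pre-norm transformer block: (linear | softmax) attention + MoE FFN."""
    def __init__(self, dim, use_softmax):
        super().__init__()
        self.norm1, self.norm2 = nn.LayerNorm(dim), nn.LayerNorm(dim)
        self.attn = SoftmaxAttention(dim) if use_softmax else LinearAttention(dim)
        self.moe = MoE(dim)

    def forward(self, x):
        x = x + self.attn(self.norm1(x))
        return x + self.moe(self.norm2(x))

# Assumed 7:1 interleave: every 8th block uses softmax attention.
model = nn.Sequential(*[HybridBlock(64, use_softmax=(i % 8 == 7))
                        for i in range(8)])
print(model(torch.randn(2, 16, 64)).shape)   # -> torch.Size([2, 16, 64])
```

The appeal of this kind of hybrid is that the linear-attention layers keep a constant-size state regardless of sequence length, while the occasional softmax layer restores precise token-to-token retrieval; that trade-off is what makes multi-million-token contexts tractable.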

MiniMax-01 Traffic Statistics

Monthly Visits: 490,881,889
Bounce Rate: 37.92%
Pages per Visit: 5.6
Average Visit Duration: 00:06:18
