MiniMax-01
A powerful language model with a total of 456 billion parameters, capable of processing context lengths of up to 4 million tokens.
MiniMax-01 is a powerful language model with 456 billion total parameters, of which 45.9 billion are activated per token. It adopts a hybrid architecture that combines lightning attention, softmax attention, and Mixture of Experts (MoE). Through advanced parallelism strategies and innovative computation-communication overlap techniques, including Linear Attention Sequence Parallelism Plus (LASP+), variable-length ring attention, and Expert Tensor Parallelism (ETP), it extends the training context length to 1 million tokens and can handle contexts of up to 4 million tokens during inference. MiniMax-01 delivers top-tier performance across multiple academic benchmarks.
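To make the hybrid design concrete, the sketch below shows a toy version of such a layer stack in PyTorch: most blocks use linear (lightning-style) attention, a softmax-attention block is interleaved periodically, and every block ends in a mixture-of-experts feed-forward layer. The layer count, hidden size, 1-in-8 interleaving ratio, top-1 router, and non-causal linear attention are all illustrative assumptions, not MiniMax-01's actual configuration.

```python
# Minimal, illustrative sketch of a hybrid attention + MoE layer stack.
# All sizes and the interleaving ratio are assumptions for demonstration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LinearAttention(nn.Module):
    """Simplified (non-causal) linear attention: O(n) in sequence length."""
    def __init__(self, dim):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.out = nn.Linear(dim, dim)

    def forward(self, x):                              # x: (batch, seq, dim)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q, k = F.elu(q) + 1, F.elu(k) + 1               # positive feature map
        kv = torch.einsum("bsd,bse->bde", k, v)         # accumulate k^T v once
        z = 1 / (torch.einsum("bsd,bd->bs", q, k.sum(dim=1)) + 1e-6)
        return self.out(torch.einsum("bsd,bde,bs->bse", q, kv, z))

class SoftmaxAttention(nn.Module):
    """Standard scaled dot-product attention: O(n^2) but exact."""
    def __init__(self, dim, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):
        return self.attn(x, x, x, need_weights=False)[0]

class MoEFeedForward(nn.Module):
    """Toy top-1 mixture of experts: each token is routed to one expert."""
    def __init__(self, dim, num_experts=4):
        super().__init__()
        self.gate = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x):
        scores = self.gate(x).softmax(dim=-1)           # (batch, seq, experts)
        idx = scores.argmax(dim=-1)                     # top-1 routing decision
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = idx == e
            if mask.any():                              # only run chosen tokens
                out[mask] = expert(x[mask]) * scores[..., e][mask].unsqueeze(-1)
        return out

class HybridBlock(nn.Module):
    def __init__(self, dim, use_softmax):
        super().__init__()
        self.norm1, self.norm2 = nn.LayerNorm(dim), nn.LayerNorm(dim)
        self.attn = SoftmaxAttention(dim) if use_softmax else LinearAttention(dim)
        self.moe = MoEFeedForward(dim)

    def forward(self, x):
        x = x + self.attn(self.norm1(x))                # attention sublayer
        return x + self.moe(self.norm2(x))              # MoE feed-forward sublayer

# Every 8th block uses softmax attention; the rest use linear attention.
dim, num_layers = 64, 16
model = nn.Sequential(*[HybridBlock(dim, use_softmax=(i + 1) % 8 == 0)
                        for i in range(num_layers)])
tokens = torch.randn(2, 128, dim)
print(model(tokens).shape)                              # torch.Size([2, 128, 64])
```

The point of the interleaving is that the linear-attention blocks keep per-token cost constant in sequence length, which is what makes million-token contexts tractable, while the occasional softmax blocks retain exact all-pairs attention.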
MiniMax-01 Visits Over Time
Monthly Visits: 490,881,889
Bounce Rate: 37.92%
Pages per Visit: 5.6
Visit Duration: 00:06:18