MobileLLM-600M
An efficient 600M-parameter language model optimized for on-device applications.
Common Product, Programming, Language Model, Transformer
MobileLLM-600M is an autoregressive language model developed by Meta. It uses an optimized Transformer architecture designed for resource-constrained, on-device applications, combining the SwiGLU activation function, a deep-and-thin layer layout, embedding sharing, and grouped-query attention. On zero-shot commonsense reasoning tasks, the smaller MobileLLM-125M and MobileLLM-350M variants improve accuracy by 2.7% and 4.3%, respectively, over the previous state-of-the-art models of the same sizes. The same design philosophy scales to larger models: MobileLLM-600M, MobileLLM-1B, and MobileLLM-1.5B have also achieved state-of-the-art results.
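The three building blocks named above can be illustrated with a small pure-Python sketch. This is not Meta's implementation: the function names, the weight layout (one weight vector per output unit), and all sizes are assumptions chosen only to show the idea behind SwiGLU gating, the query-to-key/value head mapping in grouped-query attention, and the parameter saving from sharing input and output embeddings.

```python
import math


def silu(x: float) -> float:
    """SiLU (swish) activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def swiglu(x, w_gate, w_up):
    """SwiGLU gating: silu(x . w_gate[j]) * (x . w_up[j]) for each output unit j.

    w_gate and w_up are lists of weight vectors, one per output unit
    (a hypothetical layout chosen for readability).
    """
    gate = [sum(xi * w for xi, w in zip(x, col)) for col in w_gate]
    up = [sum(xi * w for xi, w in zip(x, col)) for col in w_up]
    return [silu(g) * u for g, u in zip(gate, up)]


def kv_head_for_query_head(q_head: int, n_q_heads: int, n_kv_heads: int) -> int:
    """Grouped-query attention: consecutive query heads share one key/value head."""
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be a multiple of n_kv_heads")
    group_size = n_q_heads // n_kv_heads
    return q_head // group_size


def embedding_params(vocab_size: int, dim: int, shared: bool) -> int:
    """Parameters spent on input plus output embeddings, with or without sharing."""
    return vocab_size * dim if shared else 2 * vocab_size * dim


if __name__ == "__main__":
    # Illustrative sizes only; these are not the MobileLLM-600M configuration.
    print(swiglu([1.0, 2.0], [[0.5, -0.5]], [[1.0, 1.0]]))
    print([kv_head_for_query_head(h, 8, 2) for h in range(8)])
    print(embedding_params(1000, 64, shared=False) - embedding_params(1000, 64, shared=True))
```

With 8 query heads and 2 key/value heads, each key/value head serves 4 query heads, which shrinks the key/value cache accordingly. Sharing embeddings removes one full vocabulary-by-dimension matrix, a saving that matters more at small model sizes, where embeddings account for a larger share of the total parameter count.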
MobileLLM-600M Visits Over Time
Monthly Visits: 19,075,321
Bounce Rate: 45.07%
Pages per Visit: 5.5
Visit Duration: 00:05:32