MobileLLM-125M
An efficient, optimized small language model designed for device-side applications.
MobileLLM-125M is an autoregressive language model developed by Meta, built on a transformer architecture optimized for resource-constrained device-side applications. The model combines several key techniques: the SwiGLU activation function, a deep-and-thin architecture, embedding sharing, and grouped-query attention. On zero-shot commonsense reasoning tasks, MobileLLM-125M/350M improved accuracy by 2.7% and 4.3%, respectively, over the previous 125M/350M state-of-the-art models. The same design principles scale effectively to larger models, with MobileLLM-600M/1B/1.5B all achieving state-of-the-art results.
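The SwiGLU feed-forward block mentioned above can be sketched in plain Python. This is a minimal illustration of the technique, not Meta's implementation; the function names and the list-of-rows weight format are assumptions made for the example.

```python
import math

def silu(x: float) -> float:
    """SiLU (swish) activation: x * sigmoid(x)."""
    return x * (1.0 / (1.0 + math.exp(-x)))

def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward: w_down @ (silu(w_gate @ x) * (w_up @ x)).

    x: input vector (list of floats); w_*: weight matrices as lists of rows.
    The gate branch is passed through SiLU and multiplied elementwise
    with the up-projection before the down-projection.
    """
    def matvec(w, v):
        return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]

    gate = [silu(g) for g in matvec(w_gate, x)]   # gated branch
    up = matvec(w_up, x)                          # linear branch
    hidden = [g * u for g, u in zip(gate, up)]    # elementwise product
    return matvec(w_down, hidden)
```

With identity weight matrices, each output element reduces to `silu(x_i) * x_i`, which makes the gating behavior easy to verify by hand.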
MobileLLM-125M Visits Over Time
Monthly Visits: 19,075,321
Bounce Rate: 45.07%
Pages per Visit: 5.5
Visit Duration: 00:05:32