French AI startup Mistral AI has introduced two new lightweight models, Ministral 3B and Ministral 8B (together branded "les Ministraux"), designed specifically for edge devices, with 3 billion and 8 billion parameters respectively. Both perform strongly on instruction-following benchmarks: Ministral 3B surpasses Llama 3 8B and Mistral 7B, while Ministral 8B outperforms those models in every area except coding.
Test results indicate that Ministral 3B and Ministral 8B are competitive with open-source models such as Gemma 2 and Llama 3.1. Both models support context lengths of up to 128k tokens and set new benchmarks in knowledge, common sense, reasoning, function calling, and efficiency among models under 10B parameters. Ministral 8B additionally uses a sliding-window attention mechanism for faster, more memory-efficient inference. Both can be fine-tuned for a range of use cases, such as orchestrating complex AI agent workflows or building specialized task assistants.
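To illustrate the idea behind sliding-window attention (this is a minimal sketch of the general technique, not Mistral's actual implementation): each token attends only to the previous `window` tokens rather than the full context, so attention memory scales with the window size instead of the sequence length.

```python
# Sketch: build a causal sliding-window attention mask.
# True at (i, j) means query position i may attend to key position j.
def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]

mask = sliding_window_mask(seq_len=6, window=3)
# With window=3, token 5 attends only to positions 3, 4, and 5,
# not to the whole 6-token prefix.
```

Stacking several such layers still lets information propagate beyond the window, which is how a limited window can serve a long context.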
Mistral benchmarked the les Ministraux models across knowledge and common sense, code, math, and multilingual tasks. Among base (pre-trained) models, Ministral 3B outperformed Gemma 2 2B and Llama 3.2 3B, while Ministral 8B beat Llama 3.1 8B and Mistral 7B in every area except coding. Among instruction-tuned models, Ministral 3B achieved the best results across the benchmarks, while Ministral 8B trailed Gemma 2 9B only on WildBench.
The les Ministraux models give users a computationally efficient, low-latency option, meeting growing demand for local-first inference in critical applications. Target scenarios include on-device translation, offline smart assistants, and autonomous robots. Ministral 8B is priced at $0.10 per million tokens, and Ministral 3B at $0.04 per million tokens.
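Using the quoted prices, estimating API cost for a workload is simple arithmetic (the model names and function below are illustrative, not Mistral's API):

```python
# Price per million tokens, from the figures quoted above.
PRICE_PER_MILLION_USD = {
    "ministral-8b": 0.10,
    "ministral-3b": 0.04,
}

def cost_usd(model: str, tokens: int) -> float:
    """Estimated cost in USD for processing `tokens` tokens."""
    return tokens / 1_000_000 * PRICE_PER_MILLION_USD[model]

# A 5-million-token workload costs $0.50 on Ministral 8B
# and $0.20 on Ministral 3B.
print(cost_usd("ministral-8b", 5_000_000))
print(cost_usd("ministral-3b", 5_000_000))
```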
It's worth noting that Mistral previously earned goodwill in the AI community by open-sourcing several models via magnet links, but the company has drawn criticism this year for becoming less open. Microsoft is reported to be taking a stake in and investing in Mistral, with Mistral's models to be hosted on Azure AI. Reddit users have noticed that Mistral removed its open-source commitment from its official website, and some of its models are now paid offerings, including the newly released Ministral 3B and Ministral 8B.
Details: https://mistral.ai/news/ministraux/