Mistral AI recently announced the launch of its latest generation large language model, Mistral Large2, which has achieved significant breakthroughs in cost-effectiveness, speed, and performance.

Model Overview

Mistral Large2 is a model with 123 billion parameters and a 128K context window. It supports dozens of languages including English, French, German, Spanish, Italian, Portuguese, Arabic, Hindi, Russian, Chinese, Japanese, and Korean, as well as over 80 programming languages such as Python, Java, C, C++, JavaScript, and Bash.

Performance Highlights

QQ_1721867063415.png

General Performance: In the MMLU test, the pre-trained version of Mistral Large2 achieved an accuracy of 84.0%.

QQ_1721867086970.png

QQ_1721867110778.png

Code and Reasoning Capabilities: Mistral Large2 performs on par with leading models like GPT-4, Claude3Opus, and Llama3405B in code generation and mathematical reasoning.

QQ_1721867130761.png

Multilingual Capabilities: In the multilingual MMLU benchmark, Mistral Large2 demonstrates excellent multilingual processing abilities, especially in major languages like English, French, and German.

Instruction Following and Alignment: In benchmarks such as MT-Bench, Wild Bench, and Arena Hard, Mistral Large2 significantly enhances instruction following and conversational abilities.

Tool Usage and Function Calling: The model is trained to adeptly perform parallel and sequential function calls, providing robust support for complex business applications.

Technical Features

Significantly reduces "hallucination" phenomena, enhancing the reliability and accuracy of outputs.

Enhanced self-awareness capabilities when solutions cannot be found or information is insufficient.

Focuses on generating concise and to-the-point responses, improving interaction efficiency and cost-effectiveness.

Applications and Availability

Mistral Large2 is now available on la Plateforme under the name "mistral-large-2407".

Model weights are open and hosted on HuggingFace.

Mistral AI has expanded its partnership with Google Cloud Platform to provide Managed API services through Vertex AI.

The model can also be accessed through cloud service providers such as Azure AI Studio, Amazon Bedrock, and IBM watsonx.ai.

Licensing and Usage Terms

Mistral Large2 is released under the Mistral Research License, allowing for research and non-commercial use. Commercial use requires obtaining the Mistral Commercial License.

The release of this new model marks a significant advancement for Mistral AI in the field of large language models, providing developers with more powerful and flexible tools, and有望推动各行各业的创新应用.